
AMD-K6™ MMX™ Enhanced Processor Multimedia Technology

Preliminary Information
Trademarks

AMD, the AMD logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. RISC86 is a registered trademark, and K86, AMD-K5, AMD-K6, and the AMD-K6 logo are trademarks of Advanced Micro Devices, Inc. Windows NT is a trademark of Microsoft Corporation. MMX is a trademark of the Intel Corporation. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

© 2000 Advanced Micro Devices, Inc. All rights reserved.

Advanced Micro Devices, Inc. ("AMD") reserves the right to make changes in its products without notice in order to improve design or performance characteristics. The information in this publication is believed to be accurate at the time of publication, but AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication or the information contained herein, and reserves the right to make changes at any time, without notice. AMD disclaims responsibility for any consequences resulting from the use of the information included in this publication.

This publication neither states nor implies any representations or warranties of any kind, including but not limited to, any implied warranty of merchantability or fitness for a particular purpose. AMD products are not authorized for use as critical components in life support devices or systems without AMD's written approval. AMD assumes no liability whatsoever for claims associated with the sale or use (including the use of engineering samples) of AMD products except as provided in AMD's terms and conditions of sale for such product.
20726D/0 January 2000

Contents

1 AMD-K6™ Processor Multimedia Technology
    Introduction
    Multimedia Technology Architecture
        Key Functionality
        Register Set
        Data Types
        Instructions
        Instruction Formats

2 Programming Considerations
    Feature Detection
    Task Switching
    Exceptions
    Mixing MMX™ and Floating-Point Instructions
    Prefixes

3 MMX™ Instruction Set
    EMMS, MOVD, MOVQ, PACKSSDW, PACKSSWB, PACKUSWB, PADDB, PADDD, PADDSB, PADDSW, PADDUSB, PADDUSW, PADDW, PAND, PANDN, PCMPEQB, PCMPEQD, PCMPEQW, PCMPGTB, PCMPGTD, PCMPGTW, PMADDWD, PMULHW, PMULLW, POR, PSLLD, PSLLQ, PSLLW, PSRAD, PSRAW, PSRLD, PSRLQ, PSRLW, PSUBB, PSUBD, PSUBSB, PSUBSW, PSUBUSB, PSUBUSW, PSUBW, PUNPCKHBW, PUNPCKHDQ, PUNPCKHWD, PUNPCKLBW, PUNPCKLDQ, PUNPCKLWD, PXOR
Revision History

Date        Rev  Description
July 1996   A    Initial release.
March 1997  B    Removed paragraph from "Mixing MMX™ and Floating-Point Instructions" that contained inaccuracies pertaining to floating-point tag words.
June 1997   C    Revised stack exception entry in Table 1, "MMX™ Instruction Exceptions," to include real mode and virtual-8086 mode.
June 1997   C    Revised Note 2 of Table 1 regarding floating-point exceptions.
June 1997   C    Replaced overbar with # to indicate active-low signals.
June 1997   C    Revised document to comply with MMX trademark.
June 1997   C    Revised description of EMMS instruction.
Jan 2000    D    Changed mem64 to mem32 for PUNPCKLBW, PUNPCKLWD, and PUNPCKLDQ.
1 AMD-K6™ Processor Multimedia Technology

Introduction

Next-generation PC performance requirements are being driven by emerging multimedia and communications software. 3D graphics, video, audio, and telephony capabilities are evolving across education, entertainment, and Internet applications. As multimedia applications continue to proliferate in the marketplace, PC systems suppliers are being challenged to deliver multimedia-enabled PC solutions covering all mainstream price/performance points.

In response to the growing need to provide improved PC multimedia capabilities, the AMD-K6™ MMX™ enhanced processor is the first member in the AMD family of processors to incorporate a robust multimedia technology that is fully software compatible with the MMX™ technology as defined by Intel. This multimedia technology enables scalable multimedia capabilities across a broad range of PC system price/performance points.

The AMD-K6 processor features a decode-decoupled superscalar microarchitecture and state-of-the-art design techniques to deliver true sixth-generation performance while maintaining full x86 binary software compatibility. An x86 binary-compatible processor implements the industry-standard x86 instruction set by decoding and executing the x86
instruction set as its native mode of operation. Only this native mode enables delivery of maximum performance when running PC software.

The AMD-K6 processor delivers leading-edge performance to mainstream PC systems running industry-standard x86 software. The AMD-K6 processor implements advanced design techniques like instruction pre-decoding, dual x86 opcode decoding, single-cycle internal RISC operations, parallel execution units, out-of-order execution, data forwarding, register renaming, and dynamic branch prediction. In other words, the AMD-K6 is capable of issuing, executing, and retiring multiple x86 instructions per cycle, resulting in superior scalable performance.

This document describes the multimedia technology of the AMD-K6 processor, including data types, instructions, and programming considerations.

Multimedia Technology Architecture

The multimedia technology in the AMD-K6 MMX enhanced processor is designed to accelerate media and communication applications. Specialized applications that use music synthesis, speech synthesis, speech recognition, audio and video compression and decompression, full motion video, 2D and 3D graphics, and video conferencing can take advantage of the AMD-K6 processor multimedia technology. The multimedia technology implements new instructions, new data types, and powerful parallel processing (single instruction multiple data, SIMD) techniques that can significantly increase the performance of these applications.

Key Functionality

At the lowest levels, multimedia applications (audio, video, 3D graphics, telephony, etc.) contain many similar functions. When these functions are performed on a processor that does not have MMX capability, the processor is heavily burdened by the computational requirements of this information.
Processors executing the MMX instructions increase the performance of
multimedia applications. This performance increase is a direct result of the increased multimedia bandwidth of the processor.

Multimedia applications must process large amounts of data. Parallel data computing is exemplified by applications that manipulate screen pixel information. Instead of acting on one pixel at a time, multimedia technology enables the system to act on multiple pixels simultaneously. This single instruction multiple data (SIMD) model is a key feature of MMX technology.

The AMD-K6 processor multimedia technology architecture includes four new MMX data types, 57 new MMX instructions, eight new 64-bit MMX registers, and a SIMD processing pipeline. The multimedia technology is compatible with existing x86 applications.

The 57 new MMX instructions include arithmetic functions, packing and unpacking functions, logical operations, and moves. These are the basic functions that are most commonly used in repetitive computational multimedia programs.

Multimedia applications often use smaller operands: 8-bit data is commonly used for pixel information and 16-bit data is used for audio samples. The new MMX registers allow data to be packed into 64-bit operands. For example, 8-bit data (1 byte) can be packed in sets of eight in a single 64-bit register, and all eight bytes can be operated on simultaneously by a single MMX instruction. For 256-color video modes, this translates to computing eight pixels per instruction. When an entire screen is being redrawn, these pixel manipulation routines often use highly repetitive loops. Parallel processing of eight pieces of data can reduce the processing time of a code loop by up to a factor of eight.

Multimedia applications frequently multiply and accumulate data. The multimedia technology provides instructions that add, multiply, and even combine these operations.
For example, the PMADDWD instruction can multiply and then add words of data in a single instruction that uses far fewer processor cycles than the equivalent x86 operations.
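The multiply-and-add behavior mentioned above can be sketched in software. The following Python model is only an illustration of what PMADDWD computes, assuming the usual MMX definition (four signed 16-bit products, with adjacent products summed into two 32-bit results). The helper names `to_signed`, `split_words`, and `pack_words` are hypothetical and are not part of any AMD tool or library.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def split_words(q):
    """Split a 64-bit value into four signed 16-bit words, lowest word first."""
    return [to_signed(q >> (16 * i), 16) for i in range(4)]


def pack_words(words):
    """Pack four 16-bit values (lowest word first) into one 64-bit integer."""
    q = 0
    for i, w in enumerate(words):
        q |= (w & 0xFFFF) << (16 * i)
    return q


def pmaddwd(dest, src):
    """Model of PMADDWD: multiply word pairs, then add adjacent products."""
    a, b = split_words(dest), split_words(src)
    low = (a[0] * b[0] + a[1] * b[1]) & 0xFFFFFFFF   # result bits 31-0
    high = (a[2] * b[2] + a[3] * b[3]) & 0xFFFFFFFF  # result bits 63-32
    return (high << 32) | low
```

A scalar x86 routine would need four multiplies and two adds to produce the same two sums, which is where the cycle savings described above come from.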
Executing MMX™ Instructions

A programmer must approach the use of MMX instructions differently, based on whether the code being developed is at the system level or at the application level. The details of these differences are discussed in Programming Considerations.

Before using the MMX instructions, the programmer must use the CPUID instruction to determine if the processor supports multimedia technology. See the AMD Processor Recognition Application Note, order# 20734, for more information.

Function 1 (EAX=1) of the AMD-K6 processor CPUID instruction returns the processor feature bits in the EDX register. Software can then test bit 23 of the feature bits to determine if the processor supports the multimedia technology. If bit 23 is set to 1, MMX instructions are supported. All AMD-K6 processors have bit 23 set. Once it is determined that multimedia technology is supported, subsequent code can use the MMX instructions. Alternatively, the AMD 8000_0001h extended CPUID function can be used to test whether the processor supports multimedia technology.

After a module of MMX code has executed, the programmer must empty the MMX state by executing the EMMS instruction. Because the MMX registers share the floating-point registers, an instruction is needed to prevent MMX code from interfering with floating-point. The EMMS instruction clears the multimedia state and resets all the floating-point tag bits. Emptying the MMX state sets the floating-point tag bits to empty (all ones), which marks the MMX/FP registers as invalid and available.

Register Set

The AMD-K6 processor implements eight new 64-bit MMX registers. These registers are mapped on the floating-point registers. As shown in Figure 1, the new MMX instructions refer to these registers as mmreg0 to mmreg7.
Mapping the new MMX registers on the floating-point stack enables backwards compatibility for the register saving that must occur as a result of task switching.
[Figure 1. MMX™ Registers: eight 64-bit registers, mmreg0 through mmreg7 (bits 63-0), each with a 2-bit tag field]

Aliasing the MMX registers onto the floating-point stack registers provides a safe way to introduce this new technology. Instead of needing to modify operating systems, new MMX applications can be supported through device drivers, MMX libraries, or DLL files. See the Programming Considerations section of this document for more information.

Current operating systems have support for floating-point operations. Using the floating-point registers for MMX code is an ingenious way of implementing automatic support for MMX instructions. Every time the processor executes an MMX instruction, all the floating-point register tag bits are set to zero (00b = valid). Setting the tag bits after every MMX instruction prevents the processor from having to perform extra tasks. These extra tasks are normally executed on floating-point registers when the tag field is something other than 00b.

If a task switch occurs during an MMX or floating-point instruction, the control register (CR0) task switch (TS) bit is set to 1. The processor then generates an interrupt 7 (INT 7, device not available) when it encounters the next floating-point or MMX instruction, allowing the operating system to save the state of the MMX/FP registers.
If there is a task switch when MMX applications are running with older applications that do not include MMX instructions, the MMX/FP register state is still saved automatically through the INT 7 handler.

Data Types

The AMD-K6 processor multimedia technology uses a packed data format. The data is packed in a single 64-bit MMX register or memory operand as eight bytes, four words, or two doublewords. Each byte, word, doubleword, or quadword is an integer data type.

The form of an instruction determines the data type. For example, the move instruction comes in two different forms: MOVD moves 32 bits of data and MOVQ moves 64 bits of data.

The four new data types are defined as follows:

Packed byte         Eight 8-bit bytes packed into 64 bits
                    Signed integer range: -2^7 to 2^7 - 1
                    Unsigned integer range: 0 to 2^8 - 1

Packed word         Four 16-bit words packed into 64 bits
                    Signed integer range: -2^15 to 2^15 - 1
                    Unsigned integer range: 0 to 2^16 - 1

Packed doubleword   Two 32-bit doublewords packed into 64 bits
                    Signed integer range: -2^31 to 2^31 - 1
                    Unsigned integer range: 0 to 2^32 - 1

Quadword            One 64-bit quadword
                    Signed integer range: -2^63 to 2^63 - 1
                    Unsigned integer range: 0 to 2^64 - 1

Figure 2 shows the four new data types.
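As a rough illustration of the packed formats and ranges listed above, the sketch below splits a 64-bit value into lanes and computes the signed and unsigned limits for a given element width. It is a plain Python model, not processor code; the function names `lanes`, `signed_range`, and `unsigned_range` are hypothetical.

```python
def lanes(q, width):
    """Split a 64-bit integer into unsigned lanes of `width` bits, lowest lane first."""
    mask = (1 << width) - 1
    return [(q >> (width * i)) & mask for i in range(64 // width)]


def signed_range(width):
    """Return (minimum, maximum) for a signed integer of `width` bits."""
    return -(1 << (width - 1)), (1 << (width - 1)) - 1


def unsigned_range(width):
    """Return (minimum, maximum) for an unsigned integer of `width` bits."""
    return 0, (1 << width) - 1


if __name__ == "__main__":
    q = 0x0807060504030201
    print(lanes(q, 8))    # eight packed bytes
    print(lanes(q, 16))   # four packed words
    print(lanes(q, 32))   # two packed doublewords
    print(signed_range(8), unsigned_range(8))
```

The same 64 bits can be viewed as any of the four data types; only the instruction form decides which lane width applies.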
[Figure 2. MMX™ Data Types: packed bytes (8 bits x 8, B7-B0), packed words (16 bits x 4, W3-W0), packed doublewords (32 bits x 2, D1-D0), and quadword (64 bits x 1, Q0)]

Instructions

The AMD-K6 processor multimedia technology includes 57 new MMX instructions. These new instructions are organized into the following groups:

- Arithmetic
- Empty MMX registers
- Compare
- Convert (pack/unpack)
- Logical
- Move
- Shift

The following mnemonics are used in the instructions:

- P: packed data
- B: byte
- W: word
- D: doubleword
- Q: quadword
- S: signed
- U: unsigned
- SS: signed saturation
- US: unsigned saturation

For example, the mnemonic for the pack instruction that packs four words into eight unsigned bytes is PACKUSWB. In this mnemonic, the US designates an unsigned result with saturation, and the WB means that the source is packed words and the result is packed bytes.

The term saturation is commonly used in multimedia applications. Saturation allows mathematical limits to be placed on the data elements. If a result exceeds the boundary of that data type, the result is set to the defined limit for that instruction. A common use of saturation is to prevent color wraparound.

Instruction Formats

All MMX instructions, except the EMMS instruction that uses no operands, are formatted as follows:

    instruction mmreg1, mmreg2/mem64

The source operand (mmreg2/mem64) can be either an MMX register or a memory location. The destination operand (mmreg1) can only be an MMX register.

The MOVD and MOVQ instructions also have the following acceptable formats:

    movd mmreg1, mreg32/mem32
    movd mreg32/mem32, mmreg1
    movq mem64, mmreg1

In the first example, the source operand (mreg32/mem32) can be either an integer register or a 32-bit memory address. The destination operand (mmreg1) can only be an MMX register.

The second example has the source operand as an MMX register. The destination operand (mreg32/mem32) can be either an integer register or a 32-bit memory address.

The third example has the source operand as an MMX register and the destination operand as a 64-bit memory location.

The shift instructions can also utilize an immediate source operand. It is designated as imm8.

    psrlw mmreg1, imm8
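The saturation behavior described earlier in this section can be sketched as follows. This Python model contrasts wraparound and saturating addition on a single unsigned byte, and models the per-element clamping of PACKUSWB, assuming the standard MMX definition (signed words clamped to 0..255, destination words in the low four bytes, source words in the high four). The function names are hypothetical, and the lists stand in for register lanes, lowest lane first.

```python
def saturate(value, low, high):
    """Clamp value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def add_wrap_u8(a, b):
    """Unsigned byte add with wraparound (one lane of a PADDB-style add)."""
    return (a + b) & 0xFF


def add_sat_u8(a, b):
    """Unsigned byte add with saturation (one lane of a PADDUSB-style add)."""
    return saturate(a + b, 0, 255)


def packuswb(dest_words, src_words):
    """Model of PACKUSWB: eight signed words in, eight unsigned saturated bytes out."""
    return [saturate(w, 0, 255) for w in list(dest_words) + list(src_words)]
```

With pixel data, brightening a value of 200 by 100 wraps to 44 (a dark pixel) without saturation but clamps to 255 with it, which is the color wraparound case mentioned above.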
2 Programming Considerations

This chapter describes considerations for programmers writing operating systems, compilers, and applications that utilize MMX instructions as implemented in the AMD-K6 MMX enhanced processor.

Feature Detection

To use the AMD-K6 processor multimedia technology, the programmer must determine if the processor supports it. The CPUID instruction gives programmers the ability to determine the presence of multimedia technology on the processor. Software must first test to see if the CPUID instruction is supported. For a detailed description of the CPUID instruction, see the AMD Processor Recognition Application Note, order# 20734.

The presence of the CPUID instruction is indicated by the ID bit (21) in the EFLAGS register. If this bit is writable, the CPUID instruction is supported. The following code sample shows how to test for the presence of the CPUID instruction.
    pushfd                  ; save EFLAGS
    pop   eax               ; store EFLAGS in EAX
    mov   ebx, eax          ; save in EBX for later testing
    xor   eax, 00200000h    ; toggle bit 21
    push  eax               ; put to stack
    popfd                   ; save changed EAX to EFLAGS
    pushfd                  ; push EFLAGS to TOS
    pop   eax               ; store EFLAGS in EAX
    cmp   eax, ebx          ; see if bit 21 has changed
    jz    no_cpuid          ; if no change, no CPUID

If the processor supports the CPUID instruction, the programmer must execute standard function 0 (EAX=0). This function returns a 12-character string that identifies the processor's vendor. For AMD processors, standard function 0 returns a vendor string of "AuthenticAMD". This string requires the software to follow the AMD definitions for subsequent CPUID functions and the values returned for those functions.

The next step is for the programmer to determine if MMX instructions are supported. Function 1 of the CPUID instruction provides this information. Function 1 (EAX=1) of the AMD CPUID instruction returns the feature bits in the EDX register. If bit 23 in the EDX register is set to 1, MMX instructions are supported. The following code sample shows how to test for MMX instruction support.

    mov   eax, 1            ; set up function 1
    cpuid                   ; call the function
    test  edx, 00800000h    ; test bit 23
    jnz   yes_mm            ; multimedia technology supported

Alternatively, extended function 1 (EAX=8000_0001h) can be used to determine if MMX instructions are supported.

    mov   eax, 80000001h    ; set up extended function 1
    cpuid                   ; call the function
    test  edx, 00800000h    ; test bit 23
    jnz   yes_mm            ; multimedia technology supported
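The bit tests used in the detection sequence above can be checked with a small model. The Python sketch below does not execute CPUID; it only assumes that a caller has already obtained EFLAGS and EDX values by some other means and shows the mask arithmetic. The function names and the sample EDX value are hypothetical.

```python
EFLAGS_ID_BIT = 1 << 21     # 00200000h, the ID bit toggled by the first code sample
MMX_FEATURE_BIT = 1 << 23   # 00800000h, the MMX feature bit returned in EDX


def cpuid_supported(eflags_before, eflags_after_toggle):
    """True if an attempt to toggle EFLAGS bit 21 actually changed the bit."""
    return bool((eflags_before ^ eflags_after_toggle) & EFLAGS_ID_BIT)


def has_mmx(edx_features):
    """True if bit 23 of the CPUID function 1 feature flags is set."""
    return bool(edx_features & MMX_FEATURE_BIT)


if __name__ == "__main__":
    sample_edx = 0x008001BF  # hypothetical feature word with bit 23 set
    print(hex(MMX_FEATURE_BIT), has_mmx(sample_edx))
```

Note that the mask for bit 23 is the hexadecimal value 00800000h; the decimal number 800000 has a different bit pattern and does not select bit 23.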
Task Switching

A task switch is an event that occurs within operating systems that allows multiple programs to be executed in parallel. Most modern operating systems utilize task switching and are called multitasking operating systems. There are two types of multitasking operating systems: cooperative and preemptive.

Cooperative Multitasking

In cooperative multitasking operating systems, applications do not care about other tasks that may be running. Each task assumes that it owns the machine state (processor, registers, I/O, memory, etc.). In addition, these tasks must take care of saving their own information (i.e., registers, stacks, states) in their own memory areas. The cooperative multitasking operating system does not save operating state information for the applications.

There are different types of cooperative multitasking operating systems. Some of these operating systems perform some level of state saves, but this state saving is not always reliable. All software engineers programming for a cooperative multitasking environment must save the MMX or floating-point states before relinquishing control to another task or to the operating system. The FSAVE and FRSTOR instructions are used to perform this task. Figure 3 illustrates this task switching process.

Note: Some cooperative operating systems may have API calls to perform these tasks for the application.

[Figure 3. Cooperative Task Switching: Task 1 executes MMX™/FP code and must save its state (FSAVE) before the task switch to Task 2. Task 2 must restore its state (FRSTOR), executes its code, and must save its state (FSAVE) when its code module finishes. Control then goes to Task 1, which must restore its state (FRSTOR) before it continues executing.]
Preemptive Multitasking

In preemptive multitasking operating systems like OS/2, Windows NT™, and UNIX, the operating system handles all state and register saves. The application programmer does not need to save states when programming within a preemptive multitasking environment. The preemptive multitasking operating system sets aside a save area for each task.

In a preemptive multitasking operating system, if a task switch occurs, the operating system sets the control register 0 (CR0) task switch (TS) bit to 1. If the new task encounters a floating-point or MMX instruction, an interrupt 7 (INT 7, device not available) is generated. The INT 7 handler saves the state of the first task and restores the state of the second task. The INT 7 handler sets CR0.TS to 0 and returns to the original floating-point or MMX instruction in the second task. Figure 4 illustrates this task switching process.

[Figure 4. Preemptive Task Switching: Task 1 executes MMX™/FP code; on the task switch to Task 2, CR0.TS is set to 1. When Task 2 encounters MMX/FP code with TS=1, control goes to the INT 7 handler, which saves the Task 1 state, restores the Task 2 state, sets CR0.TS=0, and returns to the Task 2 MMX/FP code.]
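The preemptive sequence above (set TS on a task switch, trap on the next MMX/FP instruction, save and restore in the handler) can be sketched as a small state machine. This is a simplified Python model written for illustration, not operating system code: the class name, the single `registers` value standing in for the whole MMX/FP state, and the `save_area` dictionary are all assumptions.

```python
class LazySwitchModel:
    """Toy model of CR0.TS and an INT 7 handler for MMX/FP state."""

    def __init__(self):
        self.ts = 0              # models CR0.TS
        self.current_task = None
        self.owner = None        # task whose state is currently in the registers
        self.registers = None    # stands in for the shared MMX/FP register state
        self.save_area = {}      # per-task save area kept by the operating system

    def task_switch(self, new_task):
        """The operating system switches tasks and sets TS to 1."""
        self.current_task = new_task
        self.ts = 1

    def _int7_handler(self):
        """Save the previous owner's state, restore the current task's, clear TS."""
        if self.owner is not None:
            self.save_area[self.owner] = self.registers
        self.registers = self.save_area.get(self.current_task)
        self.owner = self.current_task
        self.ts = 0

    def execute_mmx_or_fp(self, new_value=None):
        """Run one MMX/FP instruction; traps to the handler first if TS is 1."""
        if self.ts:
            self._int7_handler()
        if new_value is not None:
            self.registers = new_value
        return self.registers
```

In this model a task that never touches the MMX/FP registers never triggers the handler, so its task switches cost no state save at all.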
Exceptions

Table 1 contains a list of exceptions that MMX instructions can generate. The rules for exceptions have not changed in the implementation of MMX instructions. None of the exception handlers need to be modified.

Table 1. MMX™ Instruction Exceptions

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   X     X             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                                    X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X                        One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                              X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                         X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Note:
1. An invalid opcode exception (interrupt 6) occurs if an MMX instruction is executed on a processor that does not support MMX instructions.
2. If a floating-point exception is pending and the processor encounters an MMX instruction, FERR# is asserted and, if CR0.NE = 1, an interrupt 16 is generated.
Mixing MMX™ and Floating-Point Instructions

The programmer must take care when writing code that contains both MMX and floating-point instructions. The MMX code modules should be separated from the floating-point code modules. All code of one type (MMX or floating-point code) should be grouped together as often as possible. To obtain the highest performance, routines should not contain any conditional branches at the end of loops that jump to code of a different type than the code that is currently being executed.

In certain multimedia environments, floating-point and MMX instructions may be mixed. For example, if a programmer wants to change the viewing perspective of a three-dimensional scene, the perspective can be changed through transformation matrices using floating-point registers. The picture/pixel information is integer-based and requires MMX instructions to manipulate this information. Both MMX and floating-point instructions are required to perform this task.

The software must clean up after itself at the end of an MMX code module. The EMMS instruction must be used at the end of an MMX code module to mark all floating-point registers as empty (11b = empty/invalid). In cooperative multitasking operating systems, the EMMS instruction must be used when switching between tasks.

Note: In some situations, experienced programmers can utilize the MMX registers to pass information between tasks. In these situations, the EMMS instruction is not required.

The tag bits are affected by every MMX and floating-point instruction. After every MMX instruction except EMMS, all the tag bits in the floating-point tag word are set to 0. When the EMMS instruction is executed, all the tag bits in the tag word are set to 1.

Prefixes

All instructions in the x86 architecture translate to a binary value or opcode.
This 1- or 2-byte opcode value is different for each instruction. If an instruction is two bytes long, the second byte is called the ModR/M byte. The ModR/M byte is used to further describe the type of instruction that is used.
The x86 opcode and the ModR/M byte can also be followed by an SIB byte. This byte is used to describe the scale, index, and base forms of 32-bit addressing. The format of the x86 instruction allows for certain prefixes to be placed before each instruction. These prefixes indicate different types of command overrides.

The MMX instructions follow these rules just like all the current existing instructions. This allows for an easy implementation into the x86 architecture. All of the rules that apply to the x86 architecture apply to MMX instructions, including accessing registers, memory, and I/O.

Most opcode prefixes can be utilized while using MMX instructions. The following prefixes can be used with MMX instructions:

- The segment override prefixes (2Eh/CS, 36h/SS, 3Eh/DS, 26h/ES, 64h/FS, and 65h/GS) affect MMX instructions that contain a memory operand.
- The LOCK prefix (F0h) triggers an invalid opcode exception (interrupt 6).
- The address size override prefix (67h) affects MMX instructions that contain a memory operand.
3 mmx™ instruction set

the following mmx instruction definitions are in alphabetical order according to the instruction mnemonics.
emms

mnemonic | opcode | description
emms | 0f 77h | clear the mmx state

privilege: none
registers affected: mmx
flags affected: none

the emms instruction is used to clear the mmx state following the execution of a block of code using mmx instructions. because the mmx registers and tag words are shared with the floating-point unit, it is necessary to clear the state before executing code that includes floating-point instructions.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
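a minimal python sketch of the tag word behavior that emms exists to manage. it assumes the tag word is modeled as a 16-bit integer holding eight 2-bit tags (11 = empty), follows the rule stated under mixing mmx and floating-point instructions, and ignores floating-point instructions entirely. the class name TagWord and its method names are hypothetical.

```python
class TagWord:
    """Toy model of the floating-point tag word: eight 2-bit tags in 16 bits."""

    def __init__(self):
        # assumed starting state: every register tagged empty (11)
        self.bits = 0xFFFF

    def after_mmx_instruction(self):
        # any mmx instruction except emms sets all tag bits to 0
        self.bits = 0x0000

    def after_emms(self):
        # emms sets all tag bits to 1, marking every register empty
        self.bits = 0xFFFF

    def all_empty(self):
        return self.bits == 0xFFFF


tags = TagWord()
tags.after_mmx_instruction()   # for example, a movq or paddb
print(tags.all_empty())        # False: floating-point code would see a full stack
tags.after_emms()
print(tags.all_empty())        # True: safe to start floating-point code
```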
movd

mnemonic | opcode | description
movd mmreg1, reg32/mem32 | 0f 6eh | copy a 32-bit value from the general purpose register or memory location into the mmx register
movd reg32/mem32, mmreg1 | 0f 7eh | copy a 32-bit value from the mmx register into the general purpose register or memory location

privilege: none
registers affected: mmx
flags affected: none

the movd instruction moves a 32-bit data value from an mmx register to a general purpose register or memory, or it moves the 32-bit data from a general purpose register or memory into an mmx register. if the 32-bit data to be moved is provided by an mmx register, the instruction moves bits 31-0 of the mmx register into the specified register or memory location. if the 32-bit data is being moved into an mmx register, the instruction moves the 32 bits of data into bits 31-0 of the mmx register and fills bits 63-32 with zeros.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)

related instructions

see the movq instruction.
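the two directions of movd can be sketched in python, treating an mmx register as a 64-bit integer. this is an illustrative model, not amd code; the function names movd_to_mmx and movd_from_mmx and the sample values are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def movd_to_mmx(value32):
    """movd mmreg1, reg32/mem32: bits 31-0 receive the value, bits 63-32 become zero."""
    return value32 & MASK32


def movd_from_mmx(mmreg):
    """movd reg32/mem32, mmreg1: only bits 31-0 of the mmx register are copied."""
    return mmreg & MASK32


print(hex(movd_to_mmx(0x89ABCDEF)))            # upper half is zero-filled
print(hex(movd_from_mmx(0x0123456789ABCDEF)))  # upper half is ignored
```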
movq

mnemonic | opcode | description
movq mmreg1, mmreg2/mem64 | 0f 6fh | copy a 64-bit value from an mmx register or memory location into an mmx register
movq mmreg2/mem64, mmreg1 | 0f 7fh | copy a 64-bit value from an mmx register into an mmx register or memory location

privilege: none
registers affected: mmx
flags affected: none

the movq instruction moves a 64-bit data value from one mmx register to another mmx register or memory, or it moves the 64-bit data from one mmx register or memory to another mmx register. copying data from one memory location to another memory location cannot be accomplished with the movq instruction.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)

related instructions

see the movd instruction.
packssdw

mnemonic | opcode | description
packssdw mmreg1, mmreg2/mem64 | 0f 6bh | pack with saturation signed 32-bit operands into signed 16-bit results

privilege: none
registers affected: mmx
flags affected: none

the packssdw instruction performs a pack and saturate operation on two signed 32-bit values in the first operand and two signed 32-bit values in the second operand. the four signed 16-bit results are placed in the specified mmx register.

the pack operation is a data conversion. the packssdw instruction converts or packs the four signed 32-bit values into four signed 16-bit values, applying saturating arithmetic. if the signed 32-bit value is less than -32768 (8000h), it saturates to -32768 (8000h). if the signed 32-bit value is greater than 32767 (7fffh), it saturates to 32767 (7fffh). all values between -32768 and 32767 are represented with their signed 16-bit value.

the first operand must be an mmx register. in addition to providing the first operand, this mmx register is the location where the result of the pack and saturate operation is stored. the second operand can be an mmx register or a 64-bit memory location.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the packssdw instruction

operand | bits 63-32 | bits 31-0
mmreg2/mem64 (source) | 8000_0002h | 0000_8000h
mmreg1 (destination) | ffff_8002h | 0000_01fch

result | bits 63-48 | bits 47-32 | bits 31-16 | bits 15-0
mmreg1 | 8000h* | 7fffh* | 8002h | 01fch

* indicates a saturated value

the following list explains the functional illustration of the packssdw instruction:

- bits 63-32 of the source operand (mmreg2/mem64) are packed into bits 63-48 of the destination operand (mmreg1). the result is saturated to the most negative 16-bit value (8000h) because the 32-bit negative source operand (8000_0002h) exceeds the capacity of the signed 16-bit destination operand.
- bits 31-0 of the source operand are packed into bits 47-32 of the destination operand. the result is saturated to the largest possible 16-bit positive number (7fffh) because the 32-bit positive source operand (0000_8000h) exceeds the capacity of the 16-bit destination operand.
- bits 63-32 of the destination operand are packed into bits 31-16 of the destination operand. the result is not saturated because the 32-bit negative value (ffff_8002h) does not exceed the capacity of the 16-bit destination operand.
- bits 31-0 of the destination operand are packed into bits 15-0 of the destination operand. the result is not saturated because the 32-bit positive value (0000_01fch) does not exceed the capacity of the 16-bit destination operand.

related instructions

see the packsswb, packuswb, punpckhwd, and punpcklwd instructions.
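the packssdw illustration can be reproduced with a short python model that treats each mmx register as a 64-bit integer. this is a sketch for checking the values by hand, not amd code; the function names packssdw and to_signed are hypothetical, and the operand values are the ones used in the functional illustration.

```python
def to_signed(value, bits):
    """Reinterpret an unsigned bit pattern as a two's-complement integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def packssdw(dest, src):
    # doublewords in result order, starting at bits 15-0:
    # destination low, destination high, source low, source high
    dwords = [dest & 0xFFFFFFFF, dest >> 32, src & 0xFFFFFFFF, src >> 32]
    result = 0
    for lane, dword in enumerate(dwords):
        clamped = max(-32768, min(32767, to_signed(dword, 32)))
        result |= (clamped & 0xFFFF) << (16 * lane)
    return result


mmreg1 = 0xFFFF8002_000001FC   # destination before the instruction
mmreg2 = 0x80000002_00008000   # source
print(hex(packssdw(mmreg1, mmreg2)))  # 0x80007fff800201fc
```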
packsswb

mnemonic | opcode | description
packsswb mmreg1, mmreg2/mem64 | 0f 63h | pack with saturation signed 16-bit operands into signed 8-bit results

privilege: none
registers affected: mmx
flags affected: none

the packsswb instruction performs a pack and saturate operation on four signed 16-bit values in the first operand and four signed 16-bit values in the second operand. the eight signed 8-bit results are placed in the specified mmx register.

the pack operation is a data conversion. the packsswb instruction converts or packs the eight signed 16-bit values into eight signed 8-bit values, applying saturating arithmetic. if the signed 16-bit value is less than -128 (80h), it saturates to -128 (80h). if the signed 16-bit value is greater than 127 (7fh), it saturates to 127 (7fh). all values between -128 and 127 are represented by their signed 8-bit value.

the first operand must be an mmx register. in addition to providing the first operand, this mmx register is the location where the result of the pack and saturate operation is stored. the second operand can be an mmx register or a 64-bit memory location.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the packsswb instruction

operand | bits 63-48 | bits 47-32 | bits 31-16 | bits 15-0
mmreg2/mem64 (source) | 007eh | 7f00h | ef9dh | ff88h
mmreg1 (destination) | ff02h | 0085h | 007eh | 81cfh

result | bits 63-56 | 55-48 | 47-40 | 39-32 | 31-24 | 23-16 | 15-8 | 7-0
mmreg1 | 7eh | 7fh* | 80h* | 88h | 80h* | 7fh* | 7eh | 80h*

* indicates a saturated value

the following list explains the functional illustration of the packsswb instruction:

- bits 63-48 of the source operand (mmreg2/mem64) are packed into bits 63-56 of the destination operand (mmreg1). the result is not saturated because the 16-bit positive source operand (007eh) does not exceed the capacity of a signed 8-bit destination operand.
- bits 47-32 of the source operand are packed into bits 55-48 of the destination operand. the result is saturated to the largest possible 8-bit positive number (7fh) because the 16-bit positive source operand (7f00h) exceeds the capacity of a signed 8-bit destination operand.
- bits 31-16 of the source operand are packed into bits 47-40 of the destination operand. the result is saturated to the most negative 8-bit value (80h) because the 16-bit negative source operand (ef9dh) exceeds the capacity of a signed 8-bit destination operand.
- bits 15-0 of the source operand are packed into bits 39-32 of the destination operand. the result is not saturated because the 16-bit negative source operand (ff88h) does not exceed the capacity of the 8-bit destination operand.
- bits 63-48 of the destination operand are packed into bits 31-24 of the destination operand. the result is saturated to the most negative 8-bit value (80h) because the 16-bit negative value (ff02h) exceeds the capacity of a signed 8-bit destination operand.
- bits 47-32 of the destination operand are packed into bits 23-16 of the destination operand. the result is saturated to the largest possible 8-bit positive number (7fh) because the 16-bit positive value (0085h) exceeds the capacity of a signed 8-bit destination operand.
- bits 31-16 of the destination operand are packed into bits 15-8 of the destination operand. the result is not saturated because the 16-bit positive value (007eh) does not exceed the capacity of a signed 8-bit destination operand.
- bits 15-0 of the destination operand are packed into bits 7-0 of the destination operand. the result is saturated to the most negative 8-bit value (80h) because the 16-bit negative value (81cfh) exceeds the capacity of a signed 8-bit destination operand.

related instructions

see the packssdw, packuswb, punpckhbw, and punpcklbw instructions.
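the eight conversions in the packsswb illustration can be checked with a short python model, treating each mmx register as a 64-bit integer. this is an illustrative sketch, not amd code; the function name packsswb is hypothetical, and the operand values are the ones used in the functional illustration.

```python
def packsswb(dest, src):
    # words in result order, starting at bits 7-0:
    # four destination words (low to high), then four source words (low to high)
    words = [(dest >> (16 * i)) & 0xFFFF for i in range(4)]
    words += [(src >> (16 * i)) & 0xFFFF for i in range(4)]
    result = 0
    for lane, word in enumerate(words):
        signed = word - 0x10000 if word & 0x8000 else word
        clamped = max(-128, min(127, signed))   # signed 8-bit saturation
        result |= (clamped & 0xFF) << (8 * lane)
    return result


mmreg1 = 0xFF02_0085_007E_81CF   # destination before the instruction
mmreg2 = 0x007E_7F00_EF9D_FF88   # source
print(hex(packsswb(mmreg1, mmreg2)))  # 0x7e7f8088807f7e80
```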
packuswb

mnemonic | opcode | description
packuswb mmreg1, mmreg2/mem64 | 0f 67h | pack with saturation signed 16-bit operands into unsigned 8-bit results

privilege: none
registers affected: mmx
flags affected: none

the packuswb instruction performs a pack and saturate operation on four signed 16-bit values in the first operand and four signed 16-bit values in the second operand. the eight unsigned 8-bit results are placed in the specified mmx register.

the pack operation is a data conversion. the packuswb instruction converts or packs the eight signed 16-bit values into eight unsigned 8-bit values, applying saturating arithmetic. if the signed 16-bit value is a negative number, it saturates to 0 (00h). if the signed 16-bit value is greater than 255 (ffh), it saturates to 255 (ffh). all values between 0 and 255 are represented with their unsigned 8-bit value.

the first operand must be an mmx register. in addition to providing the first operand, this mmx register is the location where the result of the pack and saturate operation is stored. the second operand can be an mmx register or a 64-bit memory location.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the packuswb instruction

operand | bits 63-48 | bits 47-32 | bits 31-16 | bits 15-0
mmreg2/mem64 (source, signed) | 0112h | 008bh | 0f80h | ff88h
mmreg1 (destination, signed) | 0002h | 023ah | 007eh | fff8h

result | bits 63-56 | 55-48 | 47-40 | 39-32 | 31-24 | 23-16 | 15-8 | 7-0
mmreg1 (unsigned) | ffh* | 8bh | ffh* | 00h* | 02h | ffh* | 7eh | 00h*

* indicates a saturated value

the following list explains the functional illustration of the packuswb instruction:

- bits 63-48 of the source operand (mmreg2/mem64) are packed into bits 63-56 of the destination operand (mmreg1). the result is saturated to the largest possible 8-bit positive number (ffh) because the 16-bit positive source operand (0112h) exceeds the capacity of an unsigned 8-bit destination operand.
- bits 47-32 of the source operand are packed into bits 55-48 of the destination operand. the result is not saturated because the 16-bit positive source operand (008bh) does not exceed the capacity of an unsigned 8-bit destination operand.
- bits 31-16 of the source operand are packed into bits 47-40 of the destination operand. the result is saturated to the largest possible 8-bit positive number (ffh) because the 16-bit positive source operand (0f80h) exceeds the capacity of an unsigned 8-bit destination operand.
- bits 15-0 of the source operand are packed into bits 39-32 of the destination operand. the result is saturated to 00h because the source operand (ff88h) is a negative value.
- bits 63-48 of the destination operand are packed into bits 31-24 of the destination operand (mmreg1). the result is not saturated because the 16-bit positive value (0002h) does not exceed the capacity of an unsigned 8-bit destination operand.
- bits 47-32 of the destination operand are packed into bits 23-16 of the destination operand. the result is saturated to the largest possible 8-bit positive number (ffh) because the 16-bit positive value (023ah) exceeds the capacity of an unsigned 8-bit destination operand.
- bits 31-16 of the destination operand are packed into bits 15-8 of the destination operand. the result is not saturated because the 16-bit positive value (007eh) does not exceed the capacity of an unsigned 8-bit destination operand.
- bits 15-0 of the destination operand are packed into bits 7-0 of the destination operand. the result is saturated to 00h because the value (fff8h) is negative.

related instructions

see the packssdw, packsswb, punpckhbw, and punpcklbw instructions.
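the difference from packsswb is only the clamping range: the inputs are still read as signed 16-bit values, but the results are clamped to 0 through 255. a short python model of the packuswb illustration, treating each mmx register as a 64-bit integer, is sketched here. the function name packuswb is hypothetical, and the operand values are the ones used in the functional illustration.

```python
def packuswb(dest, src):
    # words in result order, starting at bits 7-0:
    # four destination words (low to high), then four source words (low to high)
    words = [(dest >> (16 * i)) & 0xFFFF for i in range(4)]
    words += [(src >> (16 * i)) & 0xFFFF for i in range(4)]
    result = 0
    for lane, word in enumerate(words):
        signed = word - 0x10000 if word & 0x8000 else word  # input is signed
        clamped = max(0, min(255, signed))                  # output is unsigned
        result |= clamped << (8 * lane)
    return result


mmreg1 = 0x0002_023A_007E_FFF8   # destination before the instruction
mmreg2 = 0x0112_008B_0F80_FF88   # source
print(hex(packuswb(mmreg1, mmreg2)))  # 0xff8bff0002ff7e00
```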
paddb

mnemonic | opcode | description
paddb mmreg1, mmreg2/mem64 | 0f fch | add unsigned packed 8-bit values

privilege: none
registers affected: mmx
flags affected: none

the paddb instruction adds eight unsigned 8-bit values from the source operand (an mmx register or a 64-bit memory location) to the eight corresponding unsigned 8-bit values in the destination operand (an mmx register). if any of the eight results is greater than the capacity of its 8-bit destination, the value wraps around with no carry into the next location. the eight 8-bit results are stored in the mmx register that is specified as the destination operand.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the paddb instruction

figure: eight byte-wise additions, mmreg2/mem64 + mmreg1 = mmreg1, across bits 63-0.

the following list explains the functional illustration of the paddb instruction:

- the value 53h is added to ech and wraps around to 3fh.
- the value fch is added to 14h and wraps around to 10h.
- the remaining addition operations are simple unsigned operations with no wraparound.

related instructions

see the paddd, paddw, paddsb, paddsw, paddusb, and paddusw instructions.
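the wraparound cases in the paddb illustration can be checked with a few lines of python. this sketch models the operands as lists of byte lanes rather than 64-bit registers; the function name paddb and the list layout are assumptions, and the lane order does not reflect bit positions in the register.

```python
def paddb(dest, src):
    """Add corresponding byte lanes; each sum wraps modulo 256 with no carry out."""
    return [(d + s) & 0xFF for d, s in zip(dest, src)]


# the two wrapping lanes named in the list above, plus one lane that does not wrap
src = [0x53, 0xFC, 0x07]
dest = [0xEC, 0x14, 0xF7]
print(["%02xh" % b for b in paddb(dest, src)])  # ['3fh', '10h', 'feh']
```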
paddd

mnemonic | opcode | description
paddd mmreg1, mmreg2/mem64 | 0f feh | add unsigned packed 32-bit values

privilege: none
registers affected: mmx
flags affected: none

the paddd instruction adds two unsigned 32-bit values from the source operand (an mmx register or a 64-bit memory location) to the two corresponding unsigned 32-bit values in the destination operand (an mmx register). if either of the two results is greater than the capacity of its 32-bit destination, the value wraps around with no carry into the next location. the two 32-bit results are stored in the mmx register specified as the destination operand.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the paddd instruction

figure: two doubleword additions, mmreg2/mem64 + mmreg1 = mmreg1, across bits 63-0. the operand values shown are fff0_5c43h + 000f_a3beh = 0000_0001h and 0123_4567h + 8000_0000h = 8123_4567h.

the following list explains the functional illustration of the paddd instruction:

- the value fff0_5c43h is added to 000f_a3beh and wraps around to 0000_0001h.
- the second addition is a simple unsigned add operation with no wraparound.

related instructions

see the paddb, paddw, paddsb, and paddsw instructions.
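the phrase "no carry into the next location" can be made concrete by comparing paddd with an ordinary 64-bit addition. in this python sketch the function name paddd is hypothetical, and placing the wrapping pair in bits 31-0 is an assumption made so that the lost carry is visible; the operand values themselves come from the illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def paddd(dest, src):
    """Add the two 32-bit lanes independently; each sum wraps modulo 2**32."""
    low = ((dest & MASK32) + (src & MASK32)) & MASK32
    high = ((dest >> 32) + (src >> 32)) & MASK32
    return (high << 32) | low


mmreg1 = 0x80000000_000FA3BE   # destination (lane placement assumed)
mmreg2 = 0x01234567_FFF05C43   # source (lane placement assumed)
print(hex(paddd(mmreg1, mmreg2)))        # 0x8123456700000001
print(hex((mmreg1 + mmreg2) & MASK64))   # 0x8123456800000001, carry leaks upward
```

the two results differ only in the upper lane, which is exactly the carry that paddd discards.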
paddsb

mnemonic | opcode | description
paddsb mmreg1, mmreg2/mem64 | 0f ech | add signed packed 8-bit values and saturate

privilege: none
registers affected: mmx
flags affected: none

the paddsb instruction adds eight signed 8-bit values from the source operand (an mmx register or a 64-bit memory location) to the eight corresponding signed 8-bit values in the destination operand (an mmx register). if the sum of any two 8-bit values is less than -128 (80h), it saturates to -128 (80h). if the sum of any two 8-bit values is greater than 127 (7fh), it saturates to 127 (7fh). the eight signed 8-bit results are stored in the mmx register specified as the destination operand.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the paddsb instruction

figure: eight byte-wise additions, mmreg2/mem64 + mmreg1 = mmreg1, across bits 63-0, with saturated results marked.

the following list explains the functional illustration of the paddsb instruction:

- the signed 8-bit value 00h is added to the signed 8-bit positive value 01h with a signed 8-bit positive result of 01h.
- the signed 8-bit negative value d2h (-46) is added to the signed 8-bit negative value 88h (-120) and saturates to 80h (-128), the most negative signed 8-bit value.
- the signed 8-bit positive value 53h (+83) is added to the signed 8-bit negative value ech (-20) with a signed 8-bit positive result of 3fh (+63).
- the signed 8-bit positive value 42h is added to the signed 8-bit value 00h with a signed 8-bit positive result of 42h.
- the signed 8-bit positive value 77h (+119) is added to the signed 8-bit positive value 14h (+20) and saturates to 7fh (+127), the largest possible positive value.
- the signed 8-bit positive value 70h (+112) is added to the signed 8-bit positive value 44h (+68) and saturates to 7fh (+127), the largest possible positive value.
- the signed 8-bit positive value 07h (+7) is added to the signed 8-bit negative value f7h (-9) with a signed 8-bit negative result of feh (-2).
- the signed 8-bit negative value 9ah (-102) is added to the signed 8-bit negative value a8h (-88) and saturates to 80h (-128), the most negative signed 8-bit value.

related instructions

see the paddb, paddd, paddw, and paddsw instructions.
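all eight cases in the paddsb illustration can be reproduced with a short python model. the operands are modeled as lists of byte lanes in the order of the list above, which is not meant to reflect bit positions; the function names paddsb and to_signed8 are hypothetical.

```python
def to_signed8(byte):
    """Reinterpret an 8-bit pattern as a two's-complement integer."""
    return byte - 0x100 if byte & 0x80 else byte


def paddsb(dest, src):
    out = []
    for d, s in zip(dest, src):
        total = to_signed8(d) + to_signed8(s)
        clamped = max(-128, min(127, total))   # signed saturation
        out.append(clamped & 0xFF)
    return out


src = [0x00, 0xD2, 0x53, 0x42, 0x77, 0x70, 0x07, 0x9A]    # mmreg2/mem64 lanes
dest = [0x01, 0x88, 0xEC, 0x00, 0x14, 0x44, 0xF7, 0xA8]   # mmreg1 lanes
print(["%02xh" % b for b in paddsb(dest, src)])
# ['01h', '80h', '3fh', '42h', '7fh', '7fh', 'feh', '80h']
```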
paddsw

mnemonic | opcode | description
paddsw mmreg1, mmreg2/mem64 | 0f edh | add signed packed 16-bit values and saturate

privilege: none
registers affected: mmx
flags affected: none

the paddsw instruction adds four signed 16-bit values from the source operand (an mmx register or a 64-bit memory location) to the four corresponding signed 16-bit values in the destination operand (an mmx register). if the sum of any two 16-bit values is less than -32768 (8000h), it saturates to -32768 (8000h). if the sum of any two 16-bit values is greater than 32767 (7fffh), it saturates to 32767 (7fffh). the four signed 16-bit results are stored in the mmx register specified as the destination operand.

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the paddsw instruction

figure: four word-wise additions, mmreg2/mem64 + mmreg1 = mmreg1, across bits 63-0, with saturated results marked.

the following list explains the functional illustration of the paddsw instruction:

- the signed 16-bit negative value d250h (-11696) is added to the signed 16-bit negative value 8807h (-30713) and saturates to 8000h (-32768), the most negative signed 16-bit value.
- the signed 16-bit positive value 5321h (+21281) is added to the signed 16-bit negative value ec22h (-5086) with a signed 16-bit positive result of 3f43h (+16195).
- the signed 16-bit positive value 7007h (+28679) is added to the signed 16-bit positive value 0ff9h (+4089) and saturates to 7fffh (+32767), the largest possible positive value.
- the signed 16-bit negative value ffffh (-1) is added to the signed 16-bit negative value ffffh (-1) with the negative 16-bit result of fffeh (-2).

related instructions

see the paddb, paddd, paddw, paddsb, paddusb, and paddusw instructions.
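the decimal values quoted in the paddsw illustration can be verified with a short python model. the operands are modeled as lists of word lanes in the order of the list above, which is not meant to reflect bit positions; the function names paddsw and to_signed16 are hypothetical.

```python
def to_signed16(word):
    """Reinterpret a 16-bit pattern as a two's-complement integer."""
    return word - 0x10000 if word & 0x8000 else word


def paddsw(dest, src):
    out = []
    for d, s in zip(dest, src):
        total = to_signed16(d) + to_signed16(s)
        clamped = max(-32768, min(32767, total))   # signed saturation
        out.append(clamped & 0xFFFF)
    return out


src = [0xD250, 0x5321, 0x7007, 0xFFFF]    # mmreg2/mem64 lanes
dest = [0x8807, 0xEC22, 0x0FF9, 0xFFFF]   # mmreg1 lanes
for s, d, r in zip(src, dest, paddsw(dest, src)):
    # show each lane as decimal operands and a hexadecimal result
    print("%d + %d -> %04xh" % (to_signed16(s), to_signed16(d), r))
```

the first lane prints -11696 + -30713, a true sum of -42409, which is why the result is clamped to 8000h.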
paddusb

mnemonic | opcode | description
paddusb mmreg1, mmreg2/mem64 | 0f dch | add unsigned packed 8-bit values and saturate

privilege: none
registers affected: mmx
flags affected: none

the paddusb instruction adds eight unsigned 8-bit values from the source operand (an mmx register or a 64-bit memory location) to the eight corresponding unsigned 8-bit values in the destination operand (an mmx register). the eight unsigned 8-bit results are stored in the mmx register specified as the destination operand. if the sum of any two unsigned 8-bit values is greater than 255 (ffh), it saturates to 255 (ffh).

exceptions generated:

exception | real | virtual 8086 | protected | description
invalid opcode (6) | x | x | x | the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7) | x | x | x | save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12) | - | - | x | during instruction execution, the stack segment limit was exceeded.
general protection (13) | - | - | x | during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13) | x | x | - | one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14) | - | x | x | a page fault resulted from the execution of the instruction.
floating-point exception pending (16) | x | x | x | an exception is pending due to the floating-point execution unit.
alignment check (17) | - | x | x | an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
Functional illustration of the PADDUSB instruction

[Figure: mmreg1 + mmreg2/mem64 = mmreg1, eight 8-bit lanes across bits 63-0; saturated values are marked.]

The following list explains the functional illustration of the PADDUSB instruction:

- The sum of 7Fh and 81h is 100h. This value is greater than FFh, so the result saturates to FFh.
- The sum of D2h and 88h is 15Ah. This value is greater than FFh, so the result saturates to FFh.
- The sum of 53h and ECh is 13Fh. This value is greater than FFh, so the result saturates to FFh.
- The sum of 42h and 0Eh is 50h. This value is not greater than FFh, so the result does not saturate.
- The sum of 77h and 14h is 8Bh. This value is not greater than FFh, so the result does not saturate.
- The sum of 70h and 44h is B4h. This value is not greater than FFh, so the result does not saturate.
- The sum of 07h and F7h is FEh. This value is not greater than FFh, so the result does not saturate.
- The sum of 9Ah and A8h is 142h. This value is greater than FFh, so the result saturates to FFh.

Related instructions

See the PADDB instruction.
See the PADDD instruction.
See the PADDW instruction.
See the PADDSB instruction.
See the PADDSW instruction.
See the PADDUSW instruction.
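The eight byte sums listed for PADDUSB can be reproduced on packed 64-bit values. The following Python sketch is only an illustrative model, not AMD code: `paddusb` is a hypothetical helper, and the assignment of the example bytes to bit positions 63-0 and to the destination or source operand is an assumption (the addition is commutative, so the result does not depend on it).

```python
def paddusb(dst, src):
    """Model PADDUSB on two 64-bit integers, each holding eight unsigned bytes."""
    result = 0
    for lane in range(8):
        shift = 8 * lane
        total = ((dst >> shift) & 0xFF) + ((src >> shift) & 0xFF)
        result |= min(total, 0xFF) << shift  # sums above FFh saturate to FFh
    return result


mmreg1 = 0x8188EC0EA81444F7  # assumed destination bytes: 81 88 EC 0E A8 14 44 F7
mmreg2 = 0x7FD253429A777007  # assumed source bytes:      7F D2 53 42 9A 77 70 07
print(f"{paddusb(mmreg1, mmreg2):016X}h")  # FFFFFF50FF8BB4FEh
```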
PADDUSW

Mnemonic                        Opcode   Description
PADDUSW mmreg1, mmreg2/mem64    0F DDh   Add unsigned packed 16-bit values and saturate

Privilege: None
Registers affected: MMX
Flags affected: None

The PADDUSW instruction adds four unsigned 16-bit values from the source operand (an MMX register or a 64-bit memory location) to the four corresponding unsigned 16-bit values in the destination operand (an MMX register). The four unsigned 16-bit results are stored in the MMX register specified as the destination operand. If the sum of any two unsigned 16-bit values is greater than 65,535 (FFFFh), it saturates to 65,535 (FFFFh).

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PADDUSW instruction

[Figure: mmreg1 + mmreg2/mem64 = mmreg1, four 16-bit lanes across bits 63-0; saturated values are marked.]

The following list explains the functional illustration of the PADDUSW instruction:

- The sum of 7E10h and 7000h is EE10h. This value is not greater than FFFFh, so the result does not saturate.
- The sum of 8000h and 8000h is 10000h. This value is greater than FFFFh, so the result saturates to FFFFh.
- The sum of FFFEh and 0015h is 10013h. This value is greater than FFFFh, so the result saturates to FFFFh.
- The sum of 1234h and 4567h is 579Bh. This value is not greater than FFFFh, so the result does not saturate.

Related instructions

See the PADDB instruction.
See the PADDD instruction.
See the PADDW instruction.
See the PADDSB instruction.
See the PADDSW instruction.
See the PADDUSB instruction.
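PADDUSW applies the same clamp-at-maximum rule as PADDUSB, only with 16-bit lanes. The Python sketch below makes that relationship explicit with a lane-width parameter. It is an illustrative model under stated assumptions: the helper names `add_unsigned_saturate` and `paddusw` are hypothetical, and the placement of the example words within bits 63-0 is assumed.

```python
def add_unsigned_saturate(dst, src, lane_bits):
    """Add two 64-bit packed values lane by lane, clamping each lane at its maximum."""
    lane_max = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        total = ((dst >> shift) & lane_max) + ((src >> shift) & lane_max)
        result |= min(total, lane_max) << shift
    return result


def paddusw(dst, src):
    """Model PADDUSW: four unsigned 16-bit lanes, saturating at FFFFh."""
    return add_unsigned_saturate(dst, src, 16)


mmreg1 = 0x7000800000154567  # assumed destination words: 7000 8000 0015 4567
mmreg2 = 0x7E108000FFFE1234  # assumed source words:      7E10 8000 FFFE 1234
print(f"{paddusw(mmreg1, mmreg2):016X}h")  # EE10FFFFFFFF579Bh
```

Calling `add_unsigned_saturate(dst, src, 8)` gives the byte-wide behavior described for PADDUSB.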
PADDW

Mnemonic                        Opcode   Description
PADDW mmreg1, mmreg2/mem64      0F FDh   Add unsigned packed 16-bit values

Privilege: None
Registers affected: MMX
Flags affected: None

The PADDW instruction adds four unsigned 16-bit values from the source operand (an MMX register or a 64-bit memory location) to the four corresponding unsigned 16-bit values in the destination operand (an MMX register). If any of the four results is greater than the capacity of its 16-bit destination, the value wraps around with no carry into the next location. The four 16-bit results are stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PADDW instruction

[Figure: mmreg1 + mmreg2/mem64 = mmreg1, four 16-bit lanes across bits 63-0.]

The following list explains the functional illustration of the PADDW instruction:

- The value 8000h is added to 0123h with a normal unsigned result of 8123h.
- The value FF00h is added to 01ECh and wraps around to 00ECh.
- The value 00FCh is added to 8014h with a normal unsigned result of 8110h.
- The value FFFFh is added to FFFFh and wraps around to FFFEh.

Related instructions

See the PADDB instruction.
See the PADDD instruction.
See the PADDSB instruction.
See the PADDSW instruction.
See the PADDUSB instruction.
See the PADDUSW instruction.
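The phrase "wraps around with no carry into the next location" is the key difference between PADDW and an ordinary 64-bit addition. The Python sketch below contrasts the two, using the four operand pairs from the PADDW list. It is a model under stated assumptions: `paddw` is a hypothetical helper, and the order in which the example words are packed into bits 63-0 is assumed.

```python
MASK64 = (1 << 64) - 1


def paddw(dst, src):
    """Model PADDW: four independent 16-bit additions, each wrapping modulo 10000h."""
    result = 0
    for shift in range(0, 64, 16):
        total = ((dst >> shift) & 0xFFFF) + ((src >> shift) & 0xFFFF)
        result |= (total & 0xFFFF) << shift  # discard the carry out of the lane
    return result


mmreg1 = 0xFFFF00FCFF008000  # assumed packing: FFFF 00FC FF00 8000
mmreg2 = 0xFFFF801401EC0123  # assumed packing: FFFF 8014 01EC 0123

print(f"{paddw(mmreg1, mmreg2):016X}h")            # FFFE811000EC8123h
print(f"{(mmreg1 + mmreg2) & MASK64:016X}h")       # FFFE811100EC8123h
```

In the plain 64-bit sum, the carry out of FF00h + 01ECh leaks into the neighboring word and turns 8110h into 8111h; the lane-wise model keeps each word isolated.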
PAND

Mnemonic                        Opcode   Description
PAND mmreg1, mmreg2/mem64       0F DBh   AND 64-bit values

Privilege: None
Registers affected: MMX
Flags affected: None

The PAND instruction operates on the 64-bit source and destination operands to complete a bitwise logical AND. The results are stored in the destination operand. If the corresponding bits in the source and destination operands both equal 1, the resulting bit is 1 in the destination. If either bit in the source or destination operands equals 0, the resulting bit is 0 in the destination.

The PAND instruction can be used to extract operands from packed fields based on the masks that are produced by the compare instructions PCMPEQ and PCMPGT. This technique can eliminate branch prediction overhead in MMX routines.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PAND instruction

Bits             63-48                47-32                31-16                15-0
mmreg1           1010_1111_0000_1101  0000_1111_0000_1111  1100_0001_0011_0001  1000_1100_1101_0011
mmreg2/mem64     0101_1100_1100_0011  1100_1101_0100_1110  1011_0001_0011_1001  0110_0011_0101_1001
                 logical AND          logical AND          logical AND          logical AND
mmreg1 (result)  0000_1100_0000_0001  0000_1101_0000_1110  1000_0001_0011_0001  0000_0000_0101_0001

Related instructions

See the PANDN instruction.
See the POR instruction.
See the PXOR instruction.
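The bit patterns in the PAND illustration can be verified with ordinary integer arithmetic. In the Python sketch below, `pand` and `lanes16` are hypothetical helper names used only for this check; the operand constants are the illustration's binary rows rewritten in hexadecimal.

```python
MASK64 = (1 << 64) - 1


def pand(dst, src):
    """Model PAND: bitwise AND of two 64-bit operands."""
    return dst & src & MASK64


def lanes16(value):
    """Format a 64-bit value as four 16-bit binary groups, bits 63-48 first."""
    return " ".join(f"{(value >> s) & 0xFFFF:016b}" for s in (48, 32, 16, 0))


mmreg1 = 0xAF0D0F0FC1318CD3
mmreg2 = 0x5CC3CD4EB1396359
result = pand(mmreg1, mmreg2)
print(lanes16(result))
print(f"{result:016X}h")  # 0C010D0E81310051h
```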
PANDN

Mnemonic                        Opcode   Description
PANDN mmreg1, mmreg2/mem64      0F DFh   Invert a 64-bit value, then AND the inverted value and a 64-bit value in memory or an MMX register

Privilege: None
Registers affected: MMX
Flags affected: None

The PANDN instruction first operates on the 64-bit destination operand (an MMX register) to complete a bitwise logical NOT, inverting each bit. This operation changes 1 bits to 0 bits and 0 bits to 1 bits, storing the results in the destination operand. The inverted 64-bit destination operand is then logically ANDed with the 64-bit source operand (an MMX register or a 64-bit memory operand) to complete the PANDN operation. If corresponding bits in the source operand and the inverted destination operand are both 1, the resulting bit is 1 in the destination. If either bit in the source operand or the inverted destination operand is 0, the resulting bit is 0 in the destination.

The PANDN instruction can be used to extract alternate operands from packed fields based on the inverse of the masks that are produced by the compare instructions PCMPEQ and PCMPGT. This technique can eliminate branch prediction overhead in MMX routines.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PANDN instruction

Bits               63-48                47-32                31-16                15-0
mmreg1             1010_1111_0000_1101  0000_1111_0000_1111  1100_0001_0011_0001  1000_1100_1101_0011
                   invert               invert               invert               invert
mmreg1 (inverted)  0101_0000_1111_0010  1111_0000_1111_0000  0011_1110_1100_1110  0111_0011_0010_1100
mmreg2/mem64       0101_1100_1100_0011  1100_1101_0100_1110  1011_0001_0011_1001  0110_0011_0101_1001
                   logical AND          logical AND          logical AND          logical AND
mmreg1 (result)    0101_0000_1100_0010  1100_0000_0100_0000  0011_0000_0000_1000  0110_0011_0000_1000

Related instructions

See the PAND instruction.
See the POR instruction.
See the PXOR instruction.
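The branch-free selection technique mentioned for PAND and PANDN can be sketched in software. The Python model below first reproduces the PANDN illustration, then combines a compare mask with PAND, PANDN, and POR to pick the larger signed 16-bit value in each lane. The function names (`pandn`, `pcmpgtw`, `packed_max_s16`) are hypothetical, the per-lane maximum is an example chosen for this sketch rather than one given in the manual, and the operand values in the usage lines are arbitrary.

```python
MASK64 = (1 << 64) - 1


def pandn(dst, src):
    """Model PANDN: invert dst, then AND the inverted value with src."""
    return (~dst & MASK64) & src


def pcmpgtw(dst, src):
    """Model PCMPGTW: FFFFh in each lane where signed dst > signed src, else 0000h."""
    def signed16(x):
        return x - 0x10000 if x & 0x8000 else x

    mask = 0
    for shift in range(0, 64, 16):
        d = signed16((dst >> shift) & 0xFFFF)
        s = signed16((src >> shift) & 0xFFFF)
        if d > s:
            mask |= 0xFFFF << shift
    return mask


def packed_max_s16(a, b):
    """Select max(a, b) per signed 16-bit lane without any per-lane branch."""
    mask = pcmpgtw(a, b)      # all ones in lanes where a > b
    keep_a = mask & a         # PAND keeps a in those lanes
    keep_b = pandn(mask, b)   # PANDN keeps b in the remaining lanes
    return keep_a | keep_b    # POR merges the two selections


print(f"{pandn(0xAF0D0F0FC1318CD3, 0x5CC3CD4EB1396359):016X}h")  # 50C2C04030086308h
print(f"{packed_max_s16(0xDA14800000011243, 0x00018000FFFF1234):016X}h")  # 0001800000011243h
```

Because the mask is either all ones or all zeros within a lane, exactly one of the two selections contributes to each lane of the merged result.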
PCMPEQB

Mnemonic                        Opcode   Description
PCMPEQB mmreg1, mmreg2/mem64    0F 74h   Compare packed 8-bit values for equality

Privilege: None
Registers affected: MMX
Flags affected: None

The PCMPEQB instruction operates on 8-bit data values. The instruction compares each 8-bit value in the destination operand with the corresponding 8-bit value in the source operand to determine if they are equal. If the two values are equal, all 8 bits of that value in the destination operand are set to 1. If any of the corresponding bits in the two values are not equal, all 8 bits of that value in the destination operand are set to 0.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PCMPEQB instruction

[Figure: each of the eight bytes of mmreg1 (bits 63-0) is compared with the corresponding byte of mmreg2/mem64; each result byte in mmreg1 is FFh where the comparison is true and 00h where it is false.]

Related instructions

See the PCMPEQD instruction.
See the PCMPEQW instruction.
See the PCMPGTB instruction.
See the PCMPGTD instruction.
See the PCMPGTW instruction.
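The all-ones or all-zeros result of PCMPEQB can be modeled per byte lane. The Python sketch below is an illustrative assumption, not AMD code: `pcmpeqb` is a hypothetical helper that works on lists of raw byte values, and the four example byte pairs cover only half of a 64-bit operand.

```python
def pcmpeqb(dst, src):
    """Model PCMPEQB on two equal-length lists of raw byte lanes."""
    return [0xFF if (d & 0xFF) == (s & 0xFF) else 0x00 for d, s in zip(dst, src)]


mask = pcmpeqb([0xDD, 0x15, 0x42, 0xFF], [0xDB, 0x15, 0x43, 0xFF])
print([f"{m:02X}h" for m in mask])  # ['00h', 'FFh', '00h', 'FFh']
```

A single differing bit (42h versus 43h) is enough to clear the whole result byte.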
PCMPEQD

Mnemonic                        Opcode   Description
PCMPEQD mmreg1, mmreg2/mem64    0F 76h   Compare packed 32-bit values for equality

Privilege: None
Registers affected: MMX
Flags affected: None

The PCMPEQD instruction operates on 32-bit data values. The instruction compares each 32-bit value in the destination operand with the corresponding 32-bit value in the source operand to determine if they are equal. If the two values are equal, all 32 bits of that value in the destination operand are set to 1. If any of the corresponding bits in the two values are not equal, all 32 bits of that value in the destination operand are set to 0.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PCMPEQD instruction

Bits             63-32       31-0
mmreg1           0000BA13h   EF031243h
mmreg2/mem64     0000BA14h   EF031243h
                 compare     compare
                 false       true
mmreg1 (result)  00000000h   FFFFFFFFh

Related instructions

See the PCMPEQB instruction.
See the PCMPEQW instruction.
See the PCMPGTB instruction.
See the PCMPGTD instruction.
See the PCMPGTW instruction.
PCMPEQW

Mnemonic                        Opcode   Description
PCMPEQW mmreg1, mmreg2/mem64    0F 75h   Compare packed 16-bit values for equality

Privilege: None
Registers affected: MMX
Flags affected: None

The PCMPEQW instruction operates on 16-bit data values. The instruction compares each 16-bit value in the destination operand with the corresponding 16-bit value in the source operand to determine if they are equal. If the two values are equal, all 16 bits of that value in the destination operand are set to 1. If any of the corresponding bits in the two values are not equal, all 16 bits of that value in the destination operand are set to 0.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PCMPEQW instruction

Bits             63-48   47-32   31-16   15-0
mmreg1           DA24h   8000h   1243h   1243h
mmreg2/mem64     DA14h   8000h   1243h   1234h
                 compare compare compare compare
                 false   true    true    false
mmreg1 (result)  0000h   FFFFh   FFFFh   0000h

Related instructions

See the PCMPEQB instruction.
See the PCMPEQD instruction.
See the PCMPGTB instruction.
See the PCMPGTD instruction.
See the PCMPGTW instruction.
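PCMPEQB, PCMPEQW, and PCMPEQD follow one rule at three lane widths. The Python sketch below expresses that with a width parameter and checks it against the operand values shown for PCMPEQW and PCMPEQD. The helper name `pcmpeq` and the `lane_bits` parameter are hypothetical, and the sketch is a software model rather than a description of the hardware.

```python
def pcmpeq(dst, src, lane_bits):
    """Model PCMPEQB/W/D on 64-bit packed values for lane_bits of 8, 16, or 32."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        if (dst >> shift) & lane_mask == (src >> shift) & lane_mask:
            result |= lane_mask << shift  # equal lanes become all ones
    return result


# PCMPEQW example: DA24 8000 1243 1243 versus DA14 8000 1243 1234
print(f"{pcmpeq(0xDA24800012431243, 0xDA14800012431234, 16):016X}h")  # 0000FFFFFFFF0000h

# PCMPEQD example: 0000BA13 EF031243 versus 0000BA14 EF031243
print(f"{pcmpeq(0x0000BA13EF031243, 0x0000BA14EF031243, 32):016X}h")  # 00000000FFFFFFFFh
```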
PCMPGTB

Mnemonic                        Opcode   Description
PCMPGTB mmreg1, mmreg2/mem64    0F 64h   Compare signed packed 8-bit values for magnitude

Privilege: None
Registers affected: MMX
Flags affected: None

The PCMPGTB instruction operates on signed 8-bit data values. The instruction compares each signed 8-bit value in the destination operand with the corresponding signed 8-bit value in the source operand to determine if the destination value is greater. If the value in the destination operand is greater than the value in the source operand, all 8 bits of that value in the destination operand are set to 1. If the value in the destination operand is equal to or less than the value in the source operand, all 8 bits of that value in the destination operand are set to 0.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PCMPGTB instruction

[Figure: each of the eight signed bytes of mmreg1 (bits 63-0) is tested for being greater than the corresponding byte of mmreg2/mem64; the result bytes are stored in mmreg1.]

The following list explains the functional illustration of the PCMPGTB instruction:

- The negative value DDh (-35) is greater than the negative value DCh (-36), so the result is true (FFh).
- The positive value 24h (+36) is not greater than the positive value 25h (+37), so the result is false (00h).
- The positive value 42h (+66) is greater than the positive value 41h (+65), so the result is true (FFh).
- The positive value 01h (+1) is greater than the negative value FFh (-1), so the result is true (FFh).
- The negative value 80h (-128) is not greater than the negative value 80h (-128), so the result is false (00h).
- The negative value 80h (-128) is not greater than the positive value 7Fh (+127), so the result is false (00h).
- The negative value A3h (-93) is not greater than the negative value A6h (-90), so the result is false (00h).
- The positive value 14h (+20) is greater than the positive value 04h (+4), so the result is true (FFh).

Related instructions

See the PCMPEQB instruction.
See the PCMPEQD instruction.
See the PCMPEQW instruction.
See the PCMPGTD instruction.
See the PCMPGTW instruction.
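The eight comparisons in the PCMPGTB list depend on reading each byte as a two's-complement signed value. The Python sketch below models that interpretation. The helper names `signed8` and `pcmpgtb` are hypothetical, and the lanes are passed as lists in the order of the list above rather than as a packed register.

```python
def signed8(x):
    """Interpret a raw byte as a two's-complement signed value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def pcmpgtb(dst, src):
    """Model PCMPGTB: FFh where signed dst > signed src, else 00h."""
    return [0xFF if signed8(d) > signed8(s) else 0x00 for d, s in zip(dst, src)]


dst = [0xDD, 0x24, 0x42, 0x01, 0x80, 0x80, 0xA3, 0x14]
src = [0xDC, 0x25, 0x41, 0xFF, 0x80, 0x7F, 0xA6, 0x04]
print([f"{m:02X}h" for m in pcmpgtb(dst, src)])
# ['FFh', '00h', 'FFh', 'FFh', '00h', '00h', '00h', 'FFh']

# An unsigned comparison would disagree for 01h versus FFh
print(0x01 > 0xFF)  # False
```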
PCMPGTD

Mnemonic                        Opcode   Description
PCMPGTD mmreg1, mmreg2/mem64    0F 66h   Compare signed packed 32-bit values for magnitude

Privilege: None
Registers affected: MMX
Flags affected: None

The PCMPGTD instruction operates on signed 32-bit data values. The instruction compares each signed 32-bit value in the destination operand with the corresponding signed 32-bit value in the source operand to determine if the destination value is greater. If the value in the destination operand is greater than the value in the source operand, all 32 bits of that value in the destination operand are set to 1. If the value in the destination operand is equal to or less than the value in the source operand, all 32 bits of that value in the destination operand are set to 0.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PCMPGTD instruction

[Figure: each of the two signed 32-bit values of mmreg1 (bits 63-0) is tested for being greater than the corresponding value of mmreg2/mem64; the results are stored in mmreg1.]

The following list explains the functional illustration of the PCMPGTD instruction:

- The positive value 0000_BA15h (+47637) is greater than the positive value 0000_BA14h (+47636), so the result is true (FFFF_FFFFh).
- The positive value 0000_0001h (+1) is greater than the negative value FFFF_FFFFh (-1), so the result is true (FFFF_FFFFh).

Related instructions

See the PCMPEQB instruction.
See the PCMPEQD instruction.
See the PCMPEQW instruction.
See the PCMPGTB instruction.
See the PCMPGTW instruction.
PCMPGTW

Mnemonic                        Opcode   Description
PCMPGTW mmreg1, mmreg2/mem64    0F 65h   Compare signed packed 16-bit values for magnitude

Privilege: None
Registers affected: MMX
Flags affected: None

The PCMPGTW instruction operates on signed 16-bit data values. The instruction compares each signed 16-bit value in the destination operand with the corresponding signed 16-bit value in the source operand to determine if the destination value is greater. If the value in the destination operand is greater than the value in the source operand, all 16 bits of that value in the destination operand are set to 1. If the value in the destination operand is equal to or less than the value in the source operand, all 16 bits of that value in the destination operand are set to 0.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PCMPGTW instruction

[Figure: each of the four signed 16-bit values of mmreg1 (bits 63-0) is tested for being greater than the corresponding value of mmreg2/mem64; the results are stored in mmreg1.]

The following list explains the functional illustration of the PCMPGTW instruction:

- The negative value DA14h (-9708) is not greater than the positive value 0001h (+1), so the result is false (0000h).
- The negative value 8000h (-32768) is not greater than the negative value 8000h (-32768), so the result is false (0000h).
- The positive value 0001h (+1) is greater than the negative value FFFFh (-1), so the result is true (FFFFh).
- The positive value 1243h (+4675) is greater than the positive value 1234h (+4660), so the result is true (FFFFh).

Related instructions

See the PCMPEQB instruction.
See the PCMPEQD instruction.
See the PCMPEQW instruction.
See the PCMPGTB instruction.
See the PCMPGTD instruction.
PMADDWD

Mnemonic                        Opcode   Description
PMADDWD mmreg1, mmreg2/mem64    0F F5h   Multiply signed packed 16-bit values and add the 32-bit results

Privilege: None
Registers affected: MMX
Flags affected: None

The PMADDWD instruction multiplies the four signed 16-bit values from the source operand (an MMX register or a 64-bit memory location) by the corresponding signed 16-bit values in the destination operand (an MMX register). It then adds the two 32-bit products from the upper half of the 64-bit work space and the two 32-bit products from the lower half, and stores the two 32-bit sums in the MMX destination register.

Note: If all four of the 16-bit operands that contribute to a 32-bit sum are 8000h, the result wraps around to 8000_0000h because the most negative 16-bit value of 8000h multiplied by itself equals 4000_0000h, and 4000_0000h added to 4000_0000h equals 8000_0000h. The result of multiplying two negative numbers should be a positive number, but 8000_0000h is the most negative 32-bit number rather than a positive number. This is the only instance of wraparound that can occur as a result of the PMADDWD instruction.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PMADDWD instruction

[Figure: the four 16-bit lanes of mmreg1 and mmreg2/mem64 (bits 63-48, 47-32, 31-16, 15-0) are multiplied to form four 32-bit intermediate results, which are added in pairs to form the two 32-bit results stored in mmreg1.]

The following list explains the functional illustration of the PMADDWD instruction:

- The signed 16-bit negative value FFFEh (-2) is multiplied by the signed 16-bit positive value 0002h to produce a signed 32-bit negative intermediate result of FFFF_FFFCh (-4).
- The signed 16-bit positive value 7FFFh is multiplied by the signed 16-bit positive value 7FFFh to produce a signed 32-bit positive intermediate result of 3FFF_0001h.
- The two 32-bit intermediate results are added together to produce the final signed 32-bit positive result of 3FFE_FFFDh.
- The signed 16-bit positive value 7007h is multiplied by the signed 16-bit positive value 0FF9h to produce a signed 32-bit intermediate result of 06FD_5FCFh.
- The signed 16-bit negative value FFFFh (-1) is multiplied by the signed 16-bit negative value FFFFh (-1) to produce a signed 32-bit positive intermediate result of 0000_0001h.
- The two 32-bit intermediate results are added together to produce the final signed 32-bit positive result of 06FD_5FD0h.

Related instructions

See the PMULHW instruction.
See the PMULLW instruction.
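The multiply-then-add-pairs behavior of PMADDWD, including the single wraparound case noted in its description, can be reproduced with a short software model. The Python sketch below is an assumption-laden illustration: `signed` and `pmaddwd` are hypothetical helper names, and the operand constants pack the example words in the bit positions shown in the illustration (bits 63-48 first).

```python
def signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement signed value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def pmaddwd(dst, src):
    """Model PMADDWD on two 64-bit packed values (four signed 16-bit lanes each)."""
    result = 0
    for dword in range(2):                 # low 32-bit result, then high
        total = 0
        for half in range(2):              # two adjacent 16-bit lanes per result
            shift = 32 * dword + 16 * half
            total += signed(dst >> shift, 16) * signed(src >> shift, 16)
        result |= (total & 0xFFFFFFFF) << (32 * dword)  # keep 32 bits of the sum
    return result


mmreg1 = 0xFFFF70077FFFFFFE  # FFFF 7007 7FFF FFFE
mmreg2 = 0xFFFF0FF97FFF0002  # FFFF 0FF9 7FFF 0002
print(f"{pmaddwd(mmreg1, mmreg2):016X}h")  # 06FD5FD03FFEFFFDh

# The wraparound case: every contributing word is 8000h
print(f"{pmaddwd(0x8000800080008000, 0x8000800080008000):016X}h")  # 8000000080000000h
```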
PMULHW

Mnemonic | Opcode | Description
PMULHW mmreg1, mmreg2/mem64 | 0F E5h | Multiply signed packed 16-bit values and store the high 16 bits

Privilege: none
Registers affected: MMX
Flags affected: none

The PMULHW instruction multiplies four signed 16-bit values from the source operand (an MMX register or a 64-bit memory location) by the four corresponding signed 16-bit values in the destination operand (an MMX register) and then stores the high-order 16 bits of the result (including the sign bit) in the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PMULHW Instruction

[Figure: functional illustration of the PMULHW instruction, showing the four 16-bit words of mmreg1 and mmreg2/mem64 (bits 63-0) and the four high-order 16-bit results stored in mmreg1: 0000h, 06FDh, F98Ch, and 1569h.]

The following list explains the functional illustration of the PMULHW instruction:

- The signed 16-bit negative value D250h (-2DB0h) is multiplied by the signed 16-bit negative value 8807h (-77F9h) to produce the signed 32-bit positive result of 1569_4030h. The signed high-order 16 bits of the result are stored in the destination operand.
- The signed 16-bit positive value 5321h is multiplied by the signed 16-bit negative value EC22h (-13DEh) to produce the signed 32-bit negative result of F98C_7662h (-0673_899Eh). The signed high-order 16 bits of the result are stored in the destination operand.
- The signed 16-bit positive value 7007h is multiplied by the signed 16-bit positive value 0FF9h to produce the signed 32-bit positive result of 06FD_5FCFh. The signed high-order 16 bits of the result are stored in the destination operand.
- The signed 16-bit negative value FFFFh (-1) is multiplied by the signed 16-bit negative value FFFFh (-1) to produce the signed 32-bit positive result of 0000_0001h. The signed high-order 16 bits of the result are stored in the destination operand.

Related Instructions

See the PMADDWD instruction.
See the PMULLW instruction.
See the PUNPCKHWD instruction.
See the PUNPCKLWD instruction.

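
The four PMULHW products in the list can be checked with a short model. This Python sketch is an illustration under stated assumptions: `pmulhw` is a hypothetical helper, and the placement of each word pair within the 64-bit register (first list item in bits 15-0) is assumed, since only the pairings are given in the text.

```python
def s16(x):
    """Interpret a 16-bit pattern as a signed two's-complement value."""
    return x - 0x1_0000 if x & 0x8000 else x

def pmulhw(dst, src):
    """Hypothetical model of PMULHW: keep bits 31-16 of each signed product."""
    out = 0
    for lane in range(4):
        shift = 16 * lane
        product = s16((dst >> shift) & 0xFFFF) * s16((src >> shift) & 0xFFFF)
        # Python's >> on a negative integer is an arithmetic shift, so the
        # masked value is the high word of the 32-bit two's-complement product.
        out |= ((product >> 16) & 0xFFFF) << shift
    return out

mmreg1 = 0xFFFF_7007_5321_D250   # word positions are assumed
mmreg2 = 0xFFFF_0FF9_EC22_8807
result = pmulhw(mmreg1, mmreg2)
print(f"{result:016X}")  # 000006FDF98C1569
```
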
PMULLW

Mnemonic | Opcode | Description
PMULLW mmreg1, mmreg2/mem64 | 0F D5h | Multiply signed packed 16-bit values and store the low 16 bits

Privilege: none
Registers affected: MMX
Flags affected: none

The PMULLW instruction multiplies four signed 16-bit values from the source operand (an MMX register or a 64-bit memory location) by the four corresponding signed 16-bit values in the destination operand (an MMX register) and then stores the low-order 16 bits of the result (unsigned) in the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PMULLW Instruction

[Figure: functional illustration of the PMULLW instruction, showing the four 16-bit words of mmreg1 and mmreg2/mem64 (bits 63-0) and the four low-order 16-bit results stored in mmreg1: 0001h, 5FCFh, 7662h, and 4030h.]

The following list explains the functional illustration of the PMULLW instruction:

- The signed 16-bit negative value D250h (-2DB0h) is multiplied by the signed 16-bit negative value 8807h (-77F9h) to produce the signed 32-bit positive result of 1569_4030h. The unsigned low-order 16 bits of the result are stored in the destination operand.
- The signed 16-bit positive value 5321h is multiplied by the signed 16-bit negative value EC22h (-13DEh) to produce the signed 32-bit negative result of F98C_7662h (-0673_899Eh). The unsigned low-order 16 bits of the result are stored in the destination operand.
- The signed 16-bit positive value 7007h is multiplied by the signed 16-bit positive value 0FF9h to produce the signed 32-bit positive result of 06FD_5FCFh. The unsigned low-order 16 bits of the result are stored in the destination operand.
- The signed 16-bit negative value FFFFh (-1) is multiplied by the signed 16-bit negative value FFFFh (-1) to produce the signed 32-bit positive result of 0000_0001h. The unsigned low-order 16 bits of the result are stored in the destination operand.

Related Instructions

See the PMADDWD instruction.
See the PMULHW instruction.
See the PUNPCKHWD instruction.
See the PUNPCKLWD instruction.

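
The low-order words in the PMULLW list can be reproduced without any sign handling, because the low 16 bits of a product depend only on the low 16 bits of the factors. The Python sketch below relies on that property; `pmullw` is a hypothetical helper and the word positions within the register are assumed.

```python
def pmullw(dst, src):
    """Hypothetical model of PMULLW: keep bits 15-0 of each 16x16 product."""
    out = 0
    for lane in range(4):
        shift = 16 * lane
        a = (dst >> shift) & 0xFFFF
        b = (src >> shift) & 0xFFFF
        # Multiplying the raw bit patterns gives the same low 16 bits as
        # multiplying the signed interpretations.
        out |= ((a * b) & 0xFFFF) << shift
    return out

mmreg1 = 0xFFFF_7007_5321_D250   # word positions are assumed
mmreg2 = 0xFFFF_0FF9_EC22_8807
result = pmullw(mmreg1, mmreg2)
print(f"{result:016X}")  # 00015FCF76624030
```
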
POR

Mnemonic | Opcode | Description
POR mmreg1, mmreg2/mem64 | 0F EBh | OR 64-bit values

Privilege: none
Registers affected: MMX
Flags affected: none

The POR instruction logically ORs the 64 bits of the source operand (an MMX register or a 64-bit memory location) with the 64 bits of the destination operand (an MMX register) and stores the result in the destination register. A logical OR produces a 1 bit if either or both input bits are 1. If both input bits are 0, a logical OR produces a 0 bit.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the POR Instruction

In the functional illustration of the POR instruction, the 64-bit source value is logically ORed with the 64-bit destination value, and the result is stored in the destination register.

[Figure: functional illustration of the POR instruction. The two 64-bit operands (mmreg1 and mmreg2/mem64) and the result stored in mmreg1 are shown as four 16-bit groups each, bits 63-0:]

Operand | 0101_1100_1100_0011 | 1100_1101_0100_1110 | 1011_0001_0011_1001 | 0110_0011_0101_1001
Operand | 1010_1111_0000_1101 | 0000_1111_0000_1111 | 1100_0001_0011_0001 | 1000_1100_1101_0011
Result | 1111_1111_1100_1111 | 1100_1111_0100_1111 | 1111_0001_0011_1001 | 1110_1111_1101_1011

Related Instructions

See the PAND instruction.
See the PANDN instruction.
See the PXOR instruction.

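
The bit patterns in the POR illustration can be verified directly, since the operation is a plain bitwise OR over all 64 bits. In this Python sketch the helper `por` is hypothetical, and the two operand values are taken from the figure without assuming which one is the source and which the destination (the OR is the same either way).

```python
def por(dst, src):
    """Hypothetical model of POR: bitwise OR of two 64-bit values."""
    return (dst | src) & 0xFFFF_FFFF_FFFF_FFFF

operand_a = 0b0101_1100_1100_0011_1100_1101_0100_1110_1011_0001_0011_1001_0110_0011_0101_1001
operand_b = 0b1010_1111_0000_1101_0000_1111_0000_1111_1100_0001_0011_0001_1000_1100_1101_0011
expected  = 0b1111_1111_1100_1111_1100_1111_0100_1111_1111_0001_0011_1001_1110_1111_1101_1011

result = por(operand_a, operand_b)
print(result == expected)  # True
```
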
PSLLD

Mnemonic | Opcode | Description
PSLLD mmreg1, mmreg2/mem64 | 0F F2h | Shift left logical packed 32-bit values in mmreg1 the number of positions in mmreg2/mem64 with zero fill from the right
PSLLD mmreg1, imm8 | 0F 72h /6 | Shift left logical packed 32-bit values in mmreg1 the number of positions in imm8 with zero fill from the right

Privilege: none
Registers affected: MMX
Flags affected: none

The PSLLD instruction shifts the two 32-bit operands in the destination operand (an MMX register) to the left by the number of bit positions indicated by mmreg2/mem64 or by imm8, the 8-bit immediate operand. The shifted values are zero filled from the right. The two 32-bit results are stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PSLLD Instruction

[Figure: functional illustration of the PSLLD instruction, showing mmreg1 and mmreg2/mem64 (bits 63-0) and the two shifted 32-bit results stored in mmreg1.]

The following list explains the functional illustration of the PSLLD instruction:

- The value 0000_0000_0000_0008h in mmreg2/mem64 indicates a shift of 8 bit positions to the left.
- The 32-bit value 000F_A3BEh in mmreg1 is shifted 8 bit positions to the left and stored in mmreg1 as 0FA3_BE00h.
- The 32-bit value 0123_4567h in mmreg1 is shifted 8 bit positions to the left and stored in mmreg1 as 2345_6700h.

Related Instructions

See the PSLLQ instruction.
See the PSLLW instruction.
See the PSRAD instruction.
See the PSRAW instruction.
See the PSRLD instruction.
See the PSRLQ instruction.
See the PSRLW instruction.

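
The two PSLLD shifts in the list can be modeled by shifting each 32-bit lane separately and discarding the bits that leave the lane. This is a minimal Python sketch; `pslld` is a hypothetical helper, and which doubleword sits in the high or low half of mmreg1 is an assumption.

```python
MASK32 = 0xFFFF_FFFF

def pslld(dst, count):
    """Hypothetical model of PSLLD: shift each 32-bit lane left by count."""
    out = 0
    for lane in range(2):
        value = (dst >> (32 * lane)) & MASK32
        # Bits shifted past bit 31 of the lane are lost; zeros enter from the right.
        out |= ((value << count) & MASK32) << (32 * lane)
    return out

mmreg1 = 0x0123_4567_000F_A3BE   # doubleword positions are assumed
count = 0x0000_0000_0000_0008    # shift count taken from mmreg2/mem64
result = pslld(mmreg1, count)
print(f"{result:016X}")  # 234567000FA3BE00
```
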
PSLLQ

Mnemonic | Opcode | Description
PSLLQ mmreg1, mmreg2/mem64 | 0F F3h | Shift left logical 64-bit values in mmreg1 the number of positions in mmreg2/mem64 with zero fill from the right
PSLLQ mmreg1, imm8 | 0F 73h /6 | Shift left logical 64-bit values in mmreg1 the number of positions in imm8 with zero fill from the right

Privilege: none
Registers affected: MMX
Flags affected: none

The PSLLQ instruction shifts the 64-bit operand in the destination operand (an MMX register) to the left by the number of bit positions indicated by mmreg2/mem64 or by imm8, the 8-bit immediate operand. The shifted value is zero filled from the right. The 64-bit result is stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PSLLQ Instruction

[Figure: functional illustration of the PSLLQ instruction, showing mmreg1 and mmreg2/mem64 (bits 63-0) and the shifted 64-bit result stored in mmreg1.]

The following list explains the functional illustration of the PSLLQ instruction:

- The value 0000_0000_0000_0008h in mmreg2/mem64 indicates a shift of 8 bit positions to the left.
- The 64-bit value 000F_A3BE_0123_4567h in mmreg1 is shifted 8 bit positions to the left and stored in mmreg1 as 0FA3_BE01_2345_6700h.

Related Instructions

See the PSLLD instruction.
See the PSLLW instruction.
See the PSRAD instruction.
See the PSRAW instruction.
See the PSRLD instruction.
See the PSRLQ instruction.
See the PSRLW instruction.

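
Because PSLLQ treats the register as a single 64-bit value, bits can cross the boundary between the two doublewords, which a packed 32-bit shift would discard. The Python sketch below contrasts the two on the value from the list; both function names are hypothetical helpers written for this comparison.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF
MASK32 = 0xFFFF_FFFF

def psllq(dst, count):
    """Hypothetical model of PSLLQ: one 64-bit logical left shift."""
    return (dst << count) & MASK64

def shift_left_per_dword(dst, count):
    """For comparison only: shift the two 32-bit halves independently."""
    high = ((dst >> 32) << count) & MASK32
    low = ((dst & MASK32) << count) & MASK32
    return (high << 32) | low

mmreg1 = 0x000F_A3BE_0123_4567
quad = psllq(mmreg1, 8)
packed = shift_left_per_dword(mmreg1, 8)
print(f"{quad:016X}")    # 0FA3BE0123456700
print(f"{packed:016X}")  # 0FA3BE0023456700
```

The byte 01h moves from the low doubleword into the high doubleword in the 64-bit shift, whereas the per-doubleword shift drops it.
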
PSLLW

Mnemonic | Opcode | Description
PSLLW mmreg1, mmreg2/mem64 | 0F F1h | Shift left logical packed 16-bit values in mmreg1 the number of positions in mmreg2/mem64 with zero fill from the right
PSLLW mmreg1, imm8 | 0F 71h /6 | Shift left logical packed 16-bit values in mmreg1 the number of positions in imm8 with zero fill from the right

Privilege: none
Registers affected: MMX
Flags affected: none

The PSLLW instruction shifts the four 16-bit operands in the destination operand (an MMX register) to the left by the number of bit positions indicated by mmreg2/mem64 or by imm8, the 8-bit immediate operand. The shifted values are zero filled from the right. The four 16-bit results are stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PSLLW Instruction

[Figure: functional illustration of the PSLLW instruction, showing the four 16-bit words of mmreg1 (FFFFh, 0FF9h, EC22h, 8807h), the shift count in mmreg2/mem64, and the four shifted results stored in mmreg1 (FF00h, F900h, 2200h, 0700h).]

The following list explains the functional illustration of the PSLLW instruction:

- The value 0000_0000_0000_0008h in mmreg2/mem64 indicates a shift of 8 bit positions to the left.
- The 16-bit value 8807h in mmreg1 is shifted 8 bit positions to the left and stored in mmreg1 as 0700h.
- The 16-bit value EC22h in mmreg1 is shifted 8 bit positions to the left and stored in mmreg1 as 2200h.
- The 16-bit value 0FF9h in mmreg1 is shifted 8 bit positions to the left and stored in mmreg1 as F900h.
- The 16-bit value FFFFh in mmreg1 is shifted 8 bit positions to the left and stored in mmreg1 as FF00h.

Related Instructions

See the PSLLD instruction.
See the PSLLQ instruction.
See the PSRAD instruction.
See the PSRAW instruction.
See the PSRLD instruction.
See the PSRLQ instruction.
See the PSRLW instruction.

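
The PSLLW results in the list show the high byte of each word being shifted out and lost. A minimal Python sketch of that per-word behavior follows; `psllw` is a hypothetical helper, not an AMD-supplied routine.

```python
def psllw(dst, count):
    """Hypothetical model of PSLLW: shift each 16-bit lane left by count."""
    out = 0
    for lane in range(4):
        value = (dst >> (16 * lane)) & 0xFFFF
        # Only the low 16 bits of the shifted word are kept.
        out |= ((value << count) & 0xFFFF) << (16 * lane)
    return out

mmreg1 = 0xFFFF_0FF9_EC22_8807
result = psllw(mmreg1, 0x0000_0000_0000_0008)
print(f"{result:016X}")  # FF00F90022000700
```
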
PSRAD

Mnemonic | Opcode | Description
PSRAD mmreg1, mmreg2/mem64 | 0F E2h | Shift right arithmetic packed signed 32-bit values in mmreg1 the number of positions in mmreg2/mem64 with sign fill from the left
PSRAD mmreg1, imm8 | 0F 72h /4 | Shift right arithmetic packed signed 32-bit values in mmreg1 the number of positions in imm8 with sign fill from the left

Privilege: none
Registers affected: MMX
Flags affected: none

The PSRAD instruction shifts the two signed 32-bit operands in the destination operand (an MMX register) to the right by the number of bit positions indicated by mmreg2/mem64 or by imm8, the 8-bit immediate operand. The shifted values are sign filled from the left. The two signed 32-bit results are stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PSRAD Instruction

[Figure: functional illustration of the PSRAD instruction, showing mmreg1 and mmreg2/mem64 (bits 63-0) and the two shifted 32-bit results stored in mmreg1.]

The following list explains the functional illustration of the PSRAD instruction:

- The value 0000_0000_0000_0010h in mmreg2/mem64 indicates a shift of 16 bit positions to the right.
- The 32-bit negative value FFF0_0000h in mmreg1 is shifted 16 bit positions to the right with sign fill from the left and stored in mmreg1 as FFFF_FFF0h.
- The 32-bit positive value 0123_0000h in mmreg1 is shifted 16 bit positions to the right with sign fill from the left and stored in mmreg1 as 0000_0123h.

Related Instructions

See the PSLLD instruction.
See the PSLLQ instruction.
See the PSLLW instruction.
See the PSRAW instruction.
See the PSRLD instruction.
See the PSRLQ instruction.
See the PSRLW instruction.
See the PUNPCKHWD instruction.
See the PUNPCKLWD instruction.

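
The sign fill in the PSRAD example can be modeled by converting each doubleword to a signed integer before shifting. In this Python sketch, `psrad` and `s32` are hypothetical helpers, and the position of each doubleword within mmreg1 is assumed.

```python
MASK32 = 0xFFFF_FFFF

def s32(x):
    """Interpret a 32-bit pattern as a signed two's-complement value."""
    return x - 0x1_0000_0000 if x & 0x8000_0000 else x

def psrad(dst, count):
    """Hypothetical model of PSRAD: arithmetic right shift of each 32-bit lane."""
    out = 0
    for lane in range(2):
        value = s32((dst >> (32 * lane)) & MASK32)
        # >> on a negative Python integer replicates the sign bit.
        out |= ((value >> count) & MASK32) << (32 * lane)
    return out

mmreg1 = 0x0123_0000_FFF0_0000   # doubleword positions are assumed
result = psrad(mmreg1, 0x0000_0000_0000_0010)
print(f"{result:016X}")  # 00000123FFFFFFF0
```
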
PSRAW

Mnemonic | Opcode | Description
PSRAW mmreg1, mmreg2/mem64 | 0F E1h | Shift right arithmetic packed signed 16-bit values in mmreg1 the number of positions in mmreg2/mem64 with sign fill from the left
PSRAW mmreg1, imm8 | 0F 71h /4 | Shift right arithmetic packed signed 16-bit values in mmreg1 the number of positions in imm8 with sign fill from the left

Privilege: none
Registers affected: MMX
Flags affected: none

The PSRAW instruction shifts the four signed 16-bit operands in the destination operand (an MMX register) to the right by the number of bit positions indicated by mmreg2/mem64 or by imm8, the 8-bit immediate operand. The shifted values are sign filled from the left. The four signed 16-bit results are stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PSRAW Instruction

[Figure: functional illustration of the PSRAW instruction, showing the four 16-bit words of mmreg1 (7F00h, 0F00h, EC00h, 8800h), the shift count in mmreg2/mem64, and the four shifted results stored in mmreg1 (007Fh, 000Fh, FFECh, FF88h).]

The following list explains the functional illustration of the PSRAW instruction:

- The value 0000_0000_0000_0008h in mmreg2/mem64 indicates a shift of 8 bit positions to the right.
- The 16-bit negative value 8800h in mmreg1 is shifted 8 bit positions to the right with sign fill from the left and stored in mmreg1 as FF88h.
- The 16-bit negative value EC00h in mmreg1 is shifted 8 bit positions to the right with sign fill from the left and stored in mmreg1 as FFECh.
- The 16-bit positive value 0F00h in mmreg1 is shifted 8 bit positions to the right with sign fill from the left and stored in mmreg1 as 000Fh.
- The 16-bit positive value 7F00h in mmreg1 is shifted 8 bit positions to the right with sign fill from the left and stored in mmreg1 as 007Fh.

Related Instructions

See the PSLLD instruction.
See the PSLLQ instruction.
See the PSLLW instruction.
See the PSRAD instruction.
See the PSRLD instruction.
See the PSRLQ instruction.
See the PSRLW instruction.
See the PUNPCKHBW instruction.
See the PUNPCKLBW instruction.

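
The PSRAW example mixes negative and positive words, so it shows both fill values (ones for negative words, zeros for positive words). The following Python sketch reproduces it; `psraw` and `s16` are hypothetical helpers introduced only for this illustration.

```python
def s16(x):
    """Interpret a 16-bit pattern as a signed two's-complement value."""
    return x - 0x1_0000 if x & 0x8000 else x

def psraw(dst, count):
    """Hypothetical model of PSRAW: arithmetic right shift of each 16-bit lane."""
    out = 0
    for lane in range(4):
        value = s16((dst >> (16 * lane)) & 0xFFFF)
        # The signed shift fills vacated bits with copies of the sign bit.
        out |= ((value >> count) & 0xFFFF) << (16 * lane)
    return out

mmreg1 = 0x7F00_0F00_EC00_8800
result = psraw(mmreg1, 0x0000_0000_0000_0008)
print(f"{result:016X}")  # 007F000FFFECFF88
```
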
PSRLD

Mnemonic | Opcode | Description
PSRLD mmreg1, mmreg2/mem64 | 0F D2h | Shift right logical packed 32-bit values in mmreg1 the number of positions in mmreg2/mem64 with zero fill from the left
PSRLD mmreg1, imm8 | 0F 72h /2 | Shift right logical packed 32-bit values in mmreg1 the number of positions in imm8 with zero fill from the left

Privilege: none
Registers affected: MMX
Flags affected: none

The PSRLD instruction shifts the two 32-bit operands in the destination operand (an MMX register) to the right by the number of bit positions indicated by mmreg2/mem64 or by imm8, the 8-bit immediate operand. The shifted values are zero filled from the left. The two 32-bit results are stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PSRLD Instruction

[Figure: functional illustration of the PSRLD instruction, showing mmreg1 and mmreg2/mem64 (bits 63-0) and the two shifted 32-bit results stored in mmreg1.]

The following list explains the functional illustration of the PSRLD instruction:

- The value 0000_0000_0000_0010h in mmreg2/mem64 indicates a shift of 16 bit positions to the right.
- The 32-bit value FFF0_0000h in mmreg1 is shifted 16 bit positions to the right and stored in mmreg1 as 0000_FFF0h.
- The 32-bit value 0123_4567h in mmreg1 is shifted 16 bit positions to the right and stored in mmreg1 as 0000_0123h.

Related Instructions

See the PSLLD instruction.
See the PSLLQ instruction.
See the PSLLW instruction.
See the PSRAD instruction.
See the PSRAW instruction.
See the PSRLQ instruction.
See the PSRLW instruction.

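
The PSRLD example uses the same input, FFF0_0000h, as the PSRAD illustration, which makes the difference between zero fill and sign fill easy to see. The Python sketch below models the logical shift and computes the arithmetic result for comparison; `psrld` is a hypothetical helper and the doubleword positions in mmreg1 are assumed.

```python
MASK32 = 0xFFFF_FFFF

def psrld(dst, count):
    """Hypothetical model of PSRLD: logical right shift of each 32-bit lane."""
    out = 0
    for lane in range(2):
        value = (dst >> (32 * lane)) & MASK32
        # The lane is treated as unsigned, so zeros enter from the left.
        out |= (value >> count) << (32 * lane)
    return out

mmreg1 = 0x0123_4567_FFF0_0000   # doubleword positions are assumed
result = psrld(mmreg1, 0x0000_0000_0000_0010)
print(f"{result:016X}")  # 000001230000FFF0

# For comparison: the same doubleword shifted arithmetically (sign fill).
arithmetic = ((0xFFF0_0000 - 0x1_0000_0000) >> 16) & MASK32
print(f"{arithmetic:08X}")  # FFFFFFF0
```
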
PSRLQ

Mnemonic | Opcode | Description
PSRLQ mmreg1, mmreg2/mem64 | 0F D3h | Shift right logical 64-bit values in mmreg1 the number of positions in mmreg2/mem64 with zero fill from the left
PSRLQ mmreg1, imm8 | 0F 73h /2 | Shift right logical 64-bit values in mmreg1 the number of positions in imm8 with zero fill from the left

Privilege: none
Registers affected: MMX
Flags affected: none

The PSRLQ instruction shifts the 64-bit operand in the destination operand (an MMX register) to the right by the number of bit positions indicated by mmreg2/mem64 or by imm8, the 8-bit immediate operand. The shifted value is zero filled from the left. The result is stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PSRLQ Instruction

[Figure: functional illustration of the PSRLQ instruction, showing mmreg1 and mmreg2/mem64 (bits 63-0) and the shifted 64-bit result stored in mmreg1.]

The following list explains the functional illustration of the PSRLQ instruction:

- The value 0000_0000_0000_0010h in mmreg2/mem64 indicates a shift of 16 bit positions to the right.
- The 64-bit value 000F_A3BE_0123_4567h in mmreg1 is shifted 16 bit positions to the right and stored in mmreg1 as 0000_000F_A3BE_0123h.

Related Instructions

See the PSLLD instruction.
See the PSLLQ instruction.
See the PSLLW instruction.
See the PSRAD instruction.
See the PSRAW instruction.
See the PSRLD instruction.
See the PSRLW instruction.

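
In the PSRLQ example, bits from the high doubleword move down into the low doubleword, which would not happen with a packed 32-bit shift. This Python sketch compares the two on the value from the list; `psrlq` and `shift_right_per_dword` are hypothetical helpers written for the comparison.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def psrlq(dst, count):
    """Hypothetical model of PSRLQ: one 64-bit logical right shift."""
    return (dst & MASK64) >> count

def shift_right_per_dword(dst, count):
    """For comparison only: shift the two 32-bit halves independently."""
    high = ((dst >> 32) & MASK32) >> count
    low = (dst & MASK32) >> count
    return (high << 32) | low

mmreg1 = 0x000F_A3BE_0123_4567
quad = psrlq(mmreg1, 16)
packed = shift_right_per_dword(mmreg1, 16)
print(f"{quad:016X}")    # 0000000FA3BE0123
print(f"{packed:016X}")  # 0000000F00000123
```
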
PSRLW

Mnemonic | Opcode | Description
PSRLW mmreg1, mmreg2/mem64 | 0F D1h | Shift right logical packed 16-bit values in mmreg1 the number of positions in mmreg2/mem64 with zero fill from the left
PSRLW mmreg1, imm8 | 0F 71h /2 | Shift right logical packed 16-bit values in mmreg1 the number of positions in imm8 with zero fill from the left

Privilege: none
Registers affected: MMX
Flags affected: none

The PSRLW instruction shifts the four 16-bit operands in the destination operand (an MMX register) to the right by the number of bit positions indicated by mmreg2/mem64 or by imm8, the 8-bit immediate operand. The shifted values are zero filled from the left. The four 16-bit results are stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception | Real | Virtual 8086 | Protected | Description
Invalid opcode (6) | X | X | X | The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7) | X | X | X | Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12) | - | - | X | During instruction execution, the stack segment limit was exceeded.
General protection (13) | - | - | X | During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13) | X | X | - | One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14) | - | X | X | A page fault resulted from the execution of the instruction.
Floating-point exception pending (16) | X | X | X | An exception is pending due to the floating-point execution unit.
Alignment check (17) | - | X | X | An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)

Functional Illustration of the PSRLW Instruction

[Figure: functional illustration of the PSRLW instruction, showing the four 16-bit words of mmreg1 (FF00h, 0FF9h, EC22h, 8800h), the shift count in mmreg2/mem64, and the four shifted results stored in mmreg1 (00FFh, 000Fh, 00ECh, 0088h).]

The following list explains the functional illustration of the PSRLW instruction:

- The value 0000_0000_0000_0008h in mmreg2/mem64 indicates a shift of 8 bit positions to the right.
- The 16-bit value 8800h in mmreg1 is shifted 8 bit positions to the right and stored in mmreg1 as 0088h.
- The 16-bit value EC22h in mmreg1 is shifted 8 bit positions to the right and stored in mmreg1 as 00ECh.
- The 16-bit value 0FF9h in mmreg1 is shifted 8 bit positions to the right and stored in mmreg1 as 000Fh.
- The 16-bit value FF00h in mmreg1 is shifted 8 bit positions to the right and stored in mmreg1 as 00FFh.

Related Instructions

See the PSLLD instruction.
See the PSLLQ instruction.
See the PSLLW instruction.
See the PSRAD instruction.
See the PSRAW instruction.
See the PSRLD instruction.
See the PSRLQ instruction.

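
The PSRLW results can be reproduced by treating each 16-bit word as unsigned and shifting it right, so the low byte of each word is discarded and zeros fill the high byte. A minimal Python sketch, with `psrlw` as a hypothetical helper:

```python
def psrlw(dst, count):
    """Hypothetical model of PSRLW: logical right shift of each 16-bit lane."""
    out = 0
    for lane in range(4):
        value = (dst >> (16 * lane)) & 0xFFFF
        # Unsigned shift: vacated high bits become zero.
        out |= (value >> count) << (16 * lane)
    return out

mmreg1 = 0xFF00_0FF9_EC22_8800
result = psrlw(mmreg1, 0x0000_0000_0000_0008)
print(f"{result:016X}")  # 00FF000F00EC0088
```
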
psubb

mnemonic                    opcode  description
psubb mmreg1, mmreg2/mem64  0f f8h  subtract unsigned packed 8-bit values with wraparound

privilege: none
registers affected: mmx
flags affected: none

the psubb instruction subtracts eight unsigned 8-bit values in the source operand (an mmx register or a 64-bit memory location) from the eight corresponding unsigned 8-bit values in the destination operand (an mmx register). if the source operand is larger than the destination operand, the result wraps around.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the psubb instruction

the following list explains the functional illustration of the psubb instruction:

- the unsigned 8-bit value ech is subtracted from the unsigned 8-bit value 53h and wraps around to 67h.
- the unsigned 8-bit value f7h is subtracted from the unsigned 8-bit value 07h and wraps around to 10h.
- the unsigned 8-bit value a8h is subtracted from the unsigned 8-bit value 9ah and wraps around to f2h.
- all the remaining operations are simple subtraction with no wraparound.

operand values shown in the figure (one row per 8-bit element; element positions within the registers are not recoverable from the extracted text):

mmreg1 (before)  mmreg2/mem64  mmreg1 (after)
00h              00h           00h
9ah              a8h           f2h  (wraparound)
07h              f7h           10h  (wraparound)
70h              2ch           44h
77h              14h           63h
42h              00h           42h
53h              ech           67h  (wraparound)
d2h              88h           4ah

related instructions

see the psubd instruction.
see the psubw instruction.
see the psubsb instruction.
see the psubsw instruction.
see the psubusb instruction.
see the psubusw instruction.
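As a rough check of the psubb illustration above, the wraparound can be modeled in Python by masking each difference to 8 bits. The function name `psubb` and the element lists are assumptions for this sketch; the element pairs are taken from the figure, but their order within the registers is not, because the extracted figure does not preserve it.

```python
def psubb(dest, src):
    """Subtract packed 8-bit elements; each result wraps modulo 256."""
    return [(d - s) & 0xFF for d, s in zip(dest, src)]


# Element pairs from the illustration (register positions are assumed).
dest = [0x00, 0x9A, 0x07, 0x70, 0x77, 0x42, 0x53, 0xD2]
src = [0x00, 0xA8, 0xF7, 0x2C, 0x14, 0x00, 0xEC, 0x88]
result = psubb(dest, src)
print(" ".join(f"{b:02x}h" for b in result))
```

The mask `& 0xFF` is what produces the wraparound: 53h minus ech is negative, and keeping only the low 8 bits of the difference gives 67h.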
psubd

mnemonic                    opcode  description
psubd mmreg1, mmreg2/mem64  0f fah  subtract unsigned packed 32-bit values with wraparound

privilege: none
registers affected: mmx
flags affected: none

the psubd instruction subtracts two unsigned 32-bit values in the source operand (an mmx register or a 64-bit memory location) from the two corresponding unsigned 32-bit values in the destination operand (an mmx register). if the source operand is larger than the destination operand, the result wraps around.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the psubd instruction

the following list explains the functional illustration of the psubd instruction:

- the unsigned 32-bit value 8000_0000h is subtracted from the unsigned 32-bit value 0123_4567h and wraps around to 8123_4567h.
- the remaining operation is a simple subtraction with no wraparound.

operand values shown in the figure (one row per 32-bit element; element positions within the registers are not recoverable from the extracted text):

mmreg1 (before)  mmreg2/mem64  mmreg1 (after)
0123_4567h       8000_0000h    8123_4567h  (wraparound)
fff0_5c43h       000f_a3beh    ffe0_b885h

related instructions

see the psubb instruction.
see the psubw instruction.
see the psubsb instruction.
see the psubsw instruction.
see the psubusb instruction.
see the psubusw instruction.
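The two 32-bit subtractions in the psubd illustration can be reproduced with the following Python sketch. It is a model for checking the arithmetic only; the name `psubd` and the list representation are assumptions, and the pairing of the second element (fff0_5c43h minus 000f_a3beh) is inferred from the figure values because that is the only pairing for which the subtraction holds.

```python
def psubd(dest, src):
    """Subtract packed 32-bit elements; each result wraps modulo 2**32."""
    return [(d - s) & 0xFFFFFFFF for d, s in zip(dest, src)]


# Element pairs from the illustration (register positions are assumed).
dest = [0x0123_4567, 0xFFF0_5C43]
src = [0x8000_0000, 0x000F_A3BE]
result = psubd(dest, src)
print(" ".join(f"{v:08x}h" for v in result))
```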
psubsb

mnemonic                     opcode  description
psubsb mmreg1, mmreg2/mem64  0f e8h  subtract signed packed 8-bit values and saturate

privilege: none
registers affected: mmx
flags affected: none

the psubsb instruction subtracts eight signed 8-bit values in the source operand (an mmx register or a 64-bit memory location) from the eight corresponding signed 8-bit values in the destination operand (an mmx register). if a result is less than -128 (80h), it saturates to -128 (80h). if a result is greater than 127 (7fh), it saturates to 127 (7fh). the eight signed 8-bit results are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the psubsb instruction

the following list explains the functional illustration of the psubsb instruction:

- the signed 8-bit positive value 0fh is subtracted from the signed 8-bit negative value 82h, and the result saturates to 80h because it is less than -128 (80h), the smallest possible signed 8-bit value.
- the signed 8-bit negative value c1h is subtracted from the signed 8-bit positive value 42h, and the result saturates to 7fh because it is greater than 127 (7fh), the largest possible signed 8-bit value.
- all the remaining operations are simple signed subtraction with no saturation.

operand values shown in the figure (one row per 8-bit element; element positions within the registers are not recoverable from the extracted text):

mmreg1 (before)  mmreg2/mem64  mmreg1 (after)
82h              0fh           80h  (saturated)
9ah              a8h           f2h
07h              f7h           10h
70h              2ch           44h
77h              14h           63h
42h              c1h           7fh  (saturated)
53h              ech           67h
d2h              88h           4ah

related instructions

see the psubb instruction.
see the psubd instruction.
see the psubw instruction.
see the psubsw instruction.
see the psubusb instruction.
see the psubusw instruction.
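Signed saturation is easy to get wrong by hand, so the psubsb illustration above is modeled here in Python. The helper `to_signed8`, the function name `psubsb`, and the list representation are assumptions for this sketch; the element pairs come from the figure, while their order within the registers is assumed.

```python
def to_signed8(value):
    """Interpret an 8-bit pattern as a two's-complement signed integer."""
    return value - 0x100 if value & 0x80 else value


def psubsb(dest, src):
    """Signed saturating subtraction of packed 8-bit elements."""
    result = []
    for d, s in zip(dest, src):
        diff = to_signed8(d) - to_signed8(s)
        diff = max(-128, min(127, diff))  # clamp to the signed 8-bit range
        result.append(diff & 0xFF)        # store as an 8-bit pattern
    return result


# Element pairs from the illustration (register positions are assumed).
dest = [0x82, 0x9A, 0x07, 0x70, 0x77, 0x42, 0x53, 0xD2]
src = [0x0F, 0xA8, 0xF7, 0x2C, 0x14, 0xC1, 0xEC, 0x88]
result = psubsb(dest, src)
print(" ".join(f"{b:02x}h" for b in result))
```

In decimal, the two saturating elements are -126 minus 15 (below -128, so 80h) and 66 minus -63 (above 127, so 7fh).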
psubsw

mnemonic                     opcode  description
psubsw mmreg1, mmreg2/mem64  0f e9h  subtract signed packed 16-bit values and saturate

privilege: none
registers affected: mmx
flags affected: none

the psubsw instruction subtracts four signed 16-bit values in the source operand (an mmx register or a 64-bit memory location) from the four corresponding signed 16-bit values in the destination operand (an mmx register). if a result is less than -32768 (8000h), it saturates to -32768 (8000h). if a result is greater than 32767 (7fffh), it saturates to 32767 (7fffh). the four signed 16-bit results are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the psubsw instruction

the following list explains the functional illustration of the psubsw instruction:

- the signed 16-bit negative value d320h is subtracted from the signed 16-bit positive value 5321h, and the result saturates to 7fffh because it is greater than 32767 (7fffh), the largest possible signed 16-bit value.
- the signed 16-bit positive value 0ff9h is subtracted from the signed 16-bit negative value 8007h, and the result saturates to 8000h because it is less than -32768 (8000h), the smallest possible signed 16-bit value.
- the remaining operations are simple signed subtraction with no saturation.

operand values shown in the figure (one row per 16-bit element; element positions within the registers are not recoverable from the extracted text):

mmreg1 (before)  mmreg2/mem64  mmreg1 (after)
ffffh            ffffh         0000h
8007h            0ff9h         8000h  (saturated)
5321h            d320h         7fffh  (saturated)
d250h            8807h         4a49h

related instructions

see the psubb instruction.
see the psubd instruction.
see the psubw instruction.
see the psubsb instruction.
see the psubusb instruction.
see the psubusw instruction.
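The psubsw illustration above follows the same pattern as psubsb with 16-bit elements, and can be sketched in Python as below. The helper `to_signed16`, the function name `psubsw`, and the list representation are assumptions for the example; the element pairs come from the figure, and their order within the registers is assumed.

```python
def to_signed16(value):
    """Interpret a 16-bit pattern as a two's-complement signed integer."""
    return value - 0x10000 if value & 0x8000 else value


def psubsw(dest, src):
    """Signed saturating subtraction of packed 16-bit elements."""
    result = []
    for d, s in zip(dest, src):
        diff = to_signed16(d) - to_signed16(s)
        diff = max(-32768, min(32767, diff))  # clamp to the signed 16-bit range
        result.append(diff & 0xFFFF)          # store as a 16-bit pattern
    return result


# Element pairs from the illustration (register positions are assumed).
dest = [0xFFFF, 0x8007, 0x5321, 0xD250]
src = [0xFFFF, 0x0FF9, 0xD320, 0x8807]
result = psubsw(dest, src)
print(" ".join(f"{w:04x}h" for w in result))
```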
psubusb

mnemonic                      opcode  description
psubusb mmreg1, mmreg2/mem64  0f d8h  subtract unsigned packed 8-bit values and saturate

privilege: none
registers affected: mmx
flags affected: none

the psubusb instruction subtracts eight unsigned 8-bit values in the source operand (an mmx register or a 64-bit memory location) from the eight corresponding unsigned 8-bit values in the destination operand (an mmx register). if any 8-bit source value is greater than its corresponding 8-bit destination value, the result saturates to 00h. the eight unsigned 8-bit results are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the psubusb instruction

the following list explains the functional illustration of the psubusb instruction:

- the unsigned 8-bit value ech is subtracted from the unsigned 8-bit value 53h, and the result saturates to 00h because the source operand is greater than the destination operand.
- the unsigned 8-bit value c1h is subtracted from the unsigned 8-bit value 42h, and the result saturates to 00h because the source operand is greater than the destination operand.
- the unsigned 8-bit value f7h is subtracted from the unsigned 8-bit value 07h, and the result saturates to 00h because the source operand is greater than the destination operand.
- all the remaining operations are simple unsigned subtraction with no saturation.

operand values shown in the figure (one row per 8-bit element; element positions within the registers are not recoverable from the extracted text):

mmreg1 (before)  mmreg2/mem64  mmreg1 (after)
82h              0fh           73h
9ah              98h           02h
07h              f7h           00h  (saturated)
70h              2ch           44h
77h              14h           63h
42h              c1h           00h  (saturated)
53h              ech           00h  (saturated)
d2h              88h           4ah

related instructions

see the psubb instruction.
see the psubd instruction.
see the psubw instruction.
see the psubsb instruction.
see the psubsw instruction.
see the psubusw instruction.
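Unsigned saturation only has a lower bound for subtraction, so the psubusb illustration above reduces to clamping each difference at zero. The Python sketch below models that; the function name `psubusb` and the list representation are assumptions for the example, and the order of the elements within the registers is assumed.

```python
def psubusb(dest, src):
    """Unsigned saturating subtraction of packed 8-bit elements."""
    # A negative difference (source larger than destination) becomes 00h.
    return [max(d - s, 0) for d, s in zip(dest, src)]


# Element pairs from the illustration (register positions are assumed).
dest = [0x82, 0x9A, 0x07, 0x70, 0x77, 0x42, 0x53, 0xD2]
src = [0x0F, 0x98, 0xF7, 0x2C, 0x14, 0xC1, 0xEC, 0x88]
result = psubusb(dest, src)
print(" ".join(f"{b:02x}h" for b in result))
```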
psubusw

mnemonic                      opcode  description
psubusw mmreg1, mmreg2/mem64  0f d9h  subtract unsigned packed 16-bit values and saturate

privilege: none
registers affected: mmx
flags affected: none

the psubusw instruction subtracts four unsigned 16-bit values in the source operand (an mmx register or a 64-bit memory location) from the four corresponding unsigned 16-bit values in the destination operand (an mmx register). if any 16-bit source value is greater than its corresponding 16-bit destination value, the result saturates to 0000h. the four unsigned 16-bit results are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the psubusw instruction

the following list explains the functional illustration of the psubusw instruction:

- the unsigned 16-bit value ec22h is subtracted from the unsigned 16-bit value 5321h, and the result saturates to 0000h because the source operand is greater than the destination operand.
- the remaining operations are simple unsigned subtraction with no saturation.

operand values shown in the figure (one row per 16-bit element; element positions within the registers are not recoverable from the extracted text):

mmreg1 (before)  mmreg2/mem64  mmreg1 (after)
ffffh            ffffh         0000h
7007h            0ff9h         600eh
5321h            ec22h         0000h  (saturated)
d250h            8807h         4a49h

related instructions

see the psubb instruction.
see the psubd instruction.
see the psubw instruction.
see the psubsb instruction.
see the psubsw instruction.
see the psubusb instruction.
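The psubusw illustration above can be modeled the same way as psubusb, with 16-bit elements. In this Python sketch the function name `psubusw` and the list representation are assumptions, as is the order of the elements within the registers.

```python
def psubusw(dest, src):
    """Unsigned saturating subtraction of packed 16-bit elements."""
    # A negative difference (source larger than destination) becomes 0000h.
    return [max(d - s, 0) for d, s in zip(dest, src)]


# Element pairs from the illustration (register positions are assumed).
dest = [0xFFFF, 0x7007, 0x5321, 0xD250]
src = [0xFFFF, 0x0FF9, 0xEC22, 0x8807]
result = psubusw(dest, src)
print(" ".join(f"{w:04x}h" for w in result))
```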
psubw

mnemonic                    opcode  description
psubw mmreg1, mmreg2/mem64  0f f9h  subtract unsigned packed 16-bit values with wraparound

privilege: none
registers affected: mmx
flags affected: none

the psubw instruction subtracts four unsigned 16-bit values in the source operand (an mmx register or a 64-bit memory location) from the four corresponding unsigned 16-bit values in the destination operand (an mmx register). if the source operand is larger than the destination operand, the result wraps around.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the psubw instruction

the following list explains the functional illustration of the psubw instruction:

- the unsigned 16-bit value ec22h is subtracted from the unsigned 16-bit value 5321h, and the result wraps around to 66ffh.
- the remaining operations are simple unsigned subtraction with no wraparound.

operand values shown in the figure (one row per 16-bit element; element positions within the registers are not recoverable from the extracted text):

mmreg1 (before)  mmreg2/mem64  mmreg1 (after)
ffffh            ffffh         0000h
7007h            0ff9h         600eh
5321h            ec22h         66ffh  (wraparound)
d250h            8807h         4a49h

related instructions

see the psubb instruction.
see the psubd instruction.
see the psubsb instruction.
see the psubsw instruction.
see the psubusb instruction.
see the psubusw instruction.
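The psubw illustration above uses the same operand values as the psubusw illustration, which makes the difference between wraparound and unsigned saturation easy to see side by side. The Python sketch below models both; the function names and the list representation are assumptions for the example, and the order of the elements within the registers is assumed.

```python
def psubw(dest, src):
    """Subtract packed 16-bit elements; each result wraps modulo 65536."""
    return [(d - s) & 0xFFFF for d, s in zip(dest, src)]


def psubusw(dest, src):
    """Unsigned saturating variant, shown only for comparison."""
    return [max(d - s, 0) for d, s in zip(dest, src)]


# Element pairs from the illustration (register positions are assumed).
dest = [0xFFFF, 0x7007, 0x5321, 0xD250]
src = [0xFFFF, 0x0FF9, 0xEC22, 0x8807]
wrapped = psubw(dest, src)
saturated = psubusw(dest, src)
print("psubw:  ", " ".join(f"{w:04x}h" for w in wrapped))
print("psubusw:", " ".join(f"{w:04x}h" for w in saturated))
```

Only the element where the source exceeds the destination (5321h minus ec22h) differs between the two results: psubw keeps the low 16 bits of the difference, while psubusw clamps it to zero.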
punpckhbw

mnemonic                        opcode  description
punpckhbw mmreg1, mmreg2/mem64  0f 68h  unpack the high 32 bits of packed 8-bit values

privilege: none
registers affected: mmx
flags affected: none

the punpckhbw instruction unpacks and interleaves four 8-bit values from the high 32 bits of the source operand (an mmx register or a 64-bit memory location) and four 8-bit values from the high 32 bits of the destination operand (an mmx register). the 8-bit values from the source operand become the high 8 bits of the 16-bit results, and the 8-bit values from the destination operand become the low 8 bits of the 16-bit results. the eight interleaved 8-bit values are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the punpckhbw instruction

in the following figure, the destination register is shown at the center to illustrate the flow of data from the two source operands. in the functional illustration of the punpckhbw instruction, the 8-bit values from mmreg1 are stored in the low-order 8 bits of the 16-bit result. the mmreg2/mem64 source operand is set to all zero bits so it can provide zero fill in the high-order 8 bits of the 16-bit result. this is a method that can be used to expand unsigned 8-bit values into unsigned 16-bit operands for subsequent processing that requires higher precision.

operand values shown in the figure (bit 63 at left, bit 0 at right; byte order reconstructed to agree with the text):

source mmreg2/mem64  00h 00h 00h 00h 00h 00h 00h 00h
source mmreg1        88h 80h 44h a8h 7fh 06h feh 80h
destination mmreg1   00h 88h 00h 80h 00h 44h 00h a8h

related instructions

see the packsswb instruction.
see the packuswb instruction.
see the psraw instruction.
see the punpckhdq instruction.
see the punpckhwd instruction.
see the punpcklbw instruction.
see the punpckldq instruction.
see the punpcklwd instruction.
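The byte interleaving described above can be sketched in Python on 64-bit integers. This is a model for following the data flow, not AMD code: the function name `punpckhbw` and the integer representation are assumptions, and the mmreg1 value 8880_44a8_7f06_fe80h is assembled from the bytes in the figure in the order that agrees with the text.

```python
def punpckhbw(dest, src):
    """Interleave the four high bytes of dest and src into four 16-bit words."""
    result = 0
    for i in range(4):
        d = (dest >> (32 + 8 * i)) & 0xFF  # byte from the destination's high half
        s = (src >> (32 + 8 * i)) & 0xFF   # matching byte from the source
        result |= (d | (s << 8)) << (16 * i)  # source byte on top, dest byte below
    return result


mmreg1 = 0x8880_44A8_7F06_FE80
mmreg2 = 0x0000_0000_0000_0000  # all zero, so each byte is zero-extended
print(f"{punpckhbw(mmreg1, mmreg2):016x}h")
```

With a zero source, each of the four high bytes of mmreg1 ends up as the low half of a 16-bit word, which is the unsigned widening the text describes.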
punpckhdq

mnemonic                        opcode  description
punpckhdq mmreg1, mmreg2/mem64  0f 6ah  unpack the high 32 bits of packed 32-bit values

privilege: none
registers affected: mmx
flags affected: none

the punpckhdq instruction unpacks and interleaves the high 32 bits of the source operand (an mmx register or a 64-bit memory location) and the high 32 bits of the destination operand (an mmx register). the 32-bit value from the source operand becomes the high 32 bits of the 64-bit result, and the 32-bit value from the destination operand becomes the low 32 bits of the 64-bit result. the interleaved 32-bit values are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the punpckhdq instruction

in the following figure, the destination register is shown at the center to illustrate the flow of data from the two source operands. in the functional illustration of the punpckhdq instruction, the 32-bit value from mmreg1 is stored in the low-order 32 bits of the 64-bit result. the mmreg2/mem64 source operand is set to all zero bits so it can provide zero fill in the high-order 32 bits of the 64-bit result. this is a method that can be used to expand unsigned 32-bit values into unsigned 64-bit operands for subsequent processing that requires higher precision.

operand values shown in the figure (bit 63 at left, bit 0 at right):

source mmreg2/mem64  0000_0000h  0000_0000h
source mmreg1        8880_44a8h  7f06_fe80h
destination mmreg1   0000_0000h  8880_44a8h

related instructions

see the punpckhbw instruction.
see the punpckhwd instruction.
see the punpcklbw instruction.
see the punpckldq instruction.
see the punpcklwd instruction.
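For 32-bit elements the unpack reduces to two shifts, as the following Python sketch of the punpckhdq illustration shows. The function name `punpckhdq` and the use of plain integers for the 64-bit registers are assumptions made for the example.

```python
def punpckhdq(dest, src):
    """Combine the high 32 bits of src (upper half) and dest (lower half)."""
    dest_high = (dest >> 32) & 0xFFFFFFFF
    src_high = (src >> 32) & 0xFFFFFFFF
    return (src_high << 32) | dest_high


mmreg1 = 0x8880_44A8_7F06_FE80
mmreg2 = 0x0000_0000_0000_0000  # all zero, so the result is zero-extended
print(f"{punpckhdq(mmreg1, mmreg2):016x}h")
```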
punpckhwd

mnemonic                        opcode  description
punpckhwd mmreg1, mmreg2/mem64  0f 69h  unpack the high 32 bits of packed 16-bit values

privilege: none
registers affected: mmx
flags affected: none

the punpckhwd instruction unpacks and interleaves two 16-bit values from the high 32 bits of the source operand (an mmx register or a 64-bit memory location) and two 16-bit values from the high 32 bits of the destination operand (an mmx register). the 16-bit values from the source operand become the high 16 bits of the 32-bit results, and the 16-bit values from the destination operand become the low 16 bits of the 32-bit results. the four interleaved 16-bit values are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the punpckhwd instruction

in the following figure, the destination register is shown at the center to illustrate the flow of data from the two source operands. in the functional illustration of the punpckhwd instruction, the 16-bit values from mmreg1 are stored in the low-order 16 bits of the 32-bit result. the 16-bit values from the mmreg2/mem64 source operand are stored in the high-order 16 bits of the 32-bit result. this is an example of the use of the punpckhwd instruction to assemble 32-bit operands from the high and low 16-bit results produced by the pmulhw and pmullw instructions. in this example, the high and low 16-bit results are interleaved to produce the signed 32-bit results 1569_4030h and f98c_7662h.

operand values shown in the figure (bit 63 at left):

source mmreg2/mem64 (high 32 bits)  1569h  f98ch
source mmreg1 (high 32 bits)        4030h  7662h
destination mmreg1                  1569h  4030h  f98ch  7662h

the figure also shows the values 0000h, 0001h, 06fdh, and 5fcfh in the low 32 bits of the two source operands; the instruction does not use them, and their assignment to the operands is not recoverable from the extracted text.

related instructions

see the packssdw instruction.
see the psrad instruction.
see the pmulhw instruction.
see the pmullw instruction.
see the punpckhbw instruction.
see the punpckhdq instruction.
see the punpcklbw instruction.
see the punpckldq instruction.
see the punpcklwd instruction.
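The assembly of 32-bit products from separate high and low halves, as described above, can be followed in this Python sketch. The function name `punpckhwd` and the integer representation are assumptions; the low 32 bits of both operands are set to zero as placeholders because the instruction ignores them, and placing 1569_4030h in the upper half of the result follows the order in which the text lists the two results.

```python
def punpckhwd(dest, src):
    """Interleave the two high words of dest and src into two 32-bit values."""
    result = 0
    for i in range(2):
        d = (dest >> (32 + 16 * i)) & 0xFFFF  # word from the destination's high half
        s = (src >> (32 + 16 * i)) & 0xFFFF   # matching word from the source
        result |= (d | (s << 16)) << (32 * i)  # source word on top, dest word below
    return result


# High halves from the illustration; low halves are unused placeholders.
mmreg1 = 0x4030_7662_0000_0000  # low 16-bit halves of the products
mmreg2 = 0x1569_F98C_0000_0000  # high 16-bit halves of the products
print(f"{punpckhwd(mmreg1, mmreg2):016x}h")
```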
punpcklbw

mnemonic                        opcode  description
punpcklbw mmreg1, mmreg2/mem32  0f 60h  unpack the low 32 bits of packed 8-bit values

privilege: none
registers affected: mmx
flags affected: none

the punpcklbw instruction unpacks and interleaves four 8-bit values from the low 32 bits of the source operand (an mmx register or a 32-bit memory location) and four 8-bit values from the low 32 bits of the destination operand (an mmx register). the 8-bit values from the source operand become the high 8 bits of the 16-bit results, and the 8-bit values from the destination operand become the low 8 bits of the 16-bit results. the eight interleaved 8-bit values are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
functional illustration of the punpcklbw instruction

in the following figure, the destination register is shown at the center to illustrate the flow of data from the two source operands. in the functional illustration of the punpcklbw instruction, the 8-bit values from mmreg1 are stored in the low-order 8 bits of the 16-bit result. the mmreg2/mem32 source operand is set to all zero bits so it can provide zero fill in the high-order 8 bits of the 16-bit result. this is a method that can be used to expand unsigned 8-bit values into unsigned 16-bit operands for subsequent processing that requires higher precision.

operand values shown in the figures (bit 63 at left, bit 0 at right; byte order reconstructed to agree with the text):

register source:
source mmreg2       00h 00h 00h 00h 00h 00h 00h 00h
source mmreg1       88h 80h 44h a8h 7fh 06h feh 80h
destination mmreg1  00h 7fh 00h 06h 00h feh 00h 80h

memory source:
source mem32        00h 00h 00h 00h
source mmreg1       88h 80h 44h a8h 7fh 06h feh 80h
destination mmreg1  00h 7fh 00h 06h 00h feh 00h 80h

related instructions

see the packsswb instruction.
see the packuswb instruction.
see the psraw instruction.
see the punpckhbw instruction.
see the punpckhdq instruction.
see the punpckhwd instruction.
see the punpckldq instruction.
see the punpcklwd instruction.
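The low-half byte interleaving described for punpcklbw can be sketched in Python on 64-bit integers. The function name `punpcklbw` and the integer representation are assumptions for the example, and the mmreg1 value 8880_44a8_7f06_fe80h is assembled from the bytes in the figure in the order that agrees with the text. Only the low 32 bits of the source are read, which matches the mem32 form of the operand.

```python
def punpcklbw(dest, src):
    """Interleave the four low bytes of dest and src into four 16-bit words."""
    result = 0
    for i in range(4):
        d = (dest >> (8 * i)) & 0xFF  # byte from the destination's low half
        s = (src >> (8 * i)) & 0xFF   # matching byte from the source
        result |= (d | (s << 8)) << (16 * i)  # source byte on top, dest byte below
    return result


mmreg1 = 0x8880_44A8_7F06_FE80
source = 0x0000_0000  # all zero, so each byte is zero-extended
print(f"{punpcklbw(mmreg1, source):016x}h")
```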
punpckldq

mnemonic                        opcode  description
punpckldq mmreg1, mmreg2/mem32  0f 62h  unpack the low 32 bits of packed 32-bit values

privilege: none
registers affected: mmx
flags affected: none

the punpckldq instruction unpacks and interleaves the low 32 bits of the source operand (an mmx register or a 32-bit memory location) and the low 32 bits of the destination operand (an mmx register). the 32-bit value from the source operand becomes the high 32 bits of the 64-bit result, and the 32-bit value from the destination operand becomes the low 32 bits of the 64-bit result. the interleaved 32-bit values are stored in the mmx register specified as the destination operand.

exceptions generated:

exception                              real  virtual 8086  protected  description
invalid opcode (6)                     x     x             x          the emulate mmx instruction bit (em) of the control register (cr0) is set to 1.
device not available (7)               x     x             x          save the floating-point or mmx state if the task switch bit (ts) of the control register (cr0) is set to 1.
stack exception (12)                                       x          during instruction execution, the stack segment limit was exceeded.
general protection (13)                                    x          during instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
segment overrun (13)                   x     x                        one of the instruction data operands falls outside the address range 00000h to 0ffffh.
page fault (14)                              x             x          a page fault resulted from the execution of the instruction.
floating-point exception pending (16)  x     x             x          an exception is pending due to the floating-point execution unit.
alignment check (17)                         x             x          an unaligned memory reference resulted from the instruction execution, and the alignment mask bit (am) of the control register (cr0) is set to 1. (in protected mode, cpl = 3.)
Functional illustration of the PUNPCKLDQ instruction

In the following figure, the destination register is shown at the center to illustrate the flow of data from the two source operands. In the functional illustration of the PUNPCKLDQ instruction, the 32-bit value from mmreg1 is stored in the low-order 32 bits of the 64-bit result. The mmreg2/mem32 source operand is set to all zero bits so it can provide zero fill in the high-order 32 bits of the 64-bit result. This method can be used to expand unsigned 32-bit values into unsigned 64-bit operands for subsequent processing that requires higher precision.

Register form:
  Source mmreg2       (bits 63-0)   0000_0000h  0000_0000h
  Destination mmreg1  (bits 63-0)   0000_0000h  7F06_FE80h
  Source mmreg1       (bits 63-0)   8880_44A8h  7F06_FE80h

Memory form:
  Source mem32        (bits 31-0)               0000_0000h
  Destination mmreg1  (bits 63-0)   0000_0000h  7F06_FE80h
  Source mmreg1       (bits 63-0)   8880_44A8h  7F06_FE80h
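The zero-fill use of PUNPCKLDQ described above can be modeled in software. The following is a minimal Python sketch, not AMD code: it treats an MMX register as a plain 64-bit integer, and the names punpckldq and MASK32 are chosen here for illustration only. The operand values are the ones used in the functional illustration.

```python
MASK32 = 0xFFFF_FFFF

def punpckldq(dest, src):
    """Software model of PUNPCKLDQ on 64-bit integers (illustrative only)."""
    low_dest = dest & MASK32   # low 32 bits of the destination operand
    low_src = src & MASK32     # low 32 bits of the source operand
    # Source value becomes the high half, destination value the low half.
    return (low_src << 32) | low_dest

mmreg1 = 0x8880_44A8_7F06_FE80   # destination operand
mmreg2 = 0x0000_0000_0000_0000   # source operand, all zero bits

result = punpckldq(mmreg1, mmreg2)
print(f"{result:016X}")          # 000000007F06FE80
```

With a zero source operand, the low 32-bit value of the destination is widened to an unsigned 64-bit value. With a nonzero source, the same function places the source's low 32 bits in the upper half of the result.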
Related instructions

See the PUNPCKHBW instruction.
See the PUNPCKHDQ instruction.
See the PUNPCKHWD instruction.
See the PUNPCKLBW instruction.
See the PUNPCKLWD instruction.
PUNPCKLWD

Mnemonic                          Opcode   Description
PUNPCKLWD mmreg1, mmreg2/mem32    0F 61h   Unpack the low 32 bits of packed 16-bit values

Privilege:            None
Registers affected:   MMX
Flags affected:       None

The PUNPCKLWD instruction unpacks and interleaves two 16-bit values from the low 32 bits of the source operand (an MMX register or a 32-bit memory location) and two 16-bit values from the low 32 bits of the destination operand (an MMX register). The 16-bit values from the source operand become the high 16 bits of the 32-bit results, and the 16-bit values from the destination operand become the low 16 bits of the 32-bit results. The four interleaved 16-bit values are stored in the MMX register specified as the destination operand.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PUNPCKLWD instruction

In the following figure, the destination register is shown at the center to illustrate the flow of data from the two source operands. In the functional illustration of the PUNPCKLWD instruction, the 16-bit values from mmreg1 are stored in the low-order 16 bits of each 32-bit result. The 16-bit values from the mmreg2/mem32 source operand are stored in the high-order 16 bits of each 32-bit result. This example uses the PUNPCKLWD instruction to assemble 32-bit operands from the high and low 16-bit results produced by the PMULHW and PMULLW instructions. The high and low 16-bit results are interleaved to produce the signed 32-bit results 06FD_5FCFh and 0000_0001h.

[Figure: register form (source mmreg2, destination mmreg1, source mmreg1) and memory form (source mem32, destination mmreg1, source mmreg1). The low 32 bits of mmreg2/mem32 hold the words 06FDh and 0000h. The low 32 bits of mmreg1 hold the words 5FCFh and 0001h. The destination mmreg1 receives 06FD_5FCFh and 0000_0001h. The upper words of mmreg2 (1569h, F98Ch) and of mmreg1 (4030h, 7662h) do not contribute to the result.]
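The PMULLW/PMULHW/PUNPCKLWD sequence described above can be sketched as a software model. This is a minimal Python illustration under stated assumptions, not AMD code: registers are modeled as 64-bit integers, all function names are chosen here, and the multiplicands 11327 and 10353 are hypothetical values picked only because their product equals the 06FD_5FCFh result quoted in the text. The source does not state which operands produced that result.

```python
MASK16 = 0xFFFF

def split_words(value):
    """Split a 64-bit value into four 16-bit words, least significant first."""
    return [(value >> (16 * i)) & MASK16 for i in range(4)]

def join_words(words):
    """Join four 16-bit words (least significant first) into a 64-bit value."""
    value = 0
    for i, word in enumerate(words):
        value |= (word & MASK16) << (16 * i)
    return value

def signed16(word):
    """Interpret a 16-bit pattern as a two's-complement signed integer."""
    return word - 0x10000 if word & 0x8000 else word

def products(dest, src):
    """Full signed 16x16 products of corresponding words."""
    return [signed16(a) * signed16(b)
            for a, b in zip(split_words(dest), split_words(src))]

def pmullw(dest, src):
    """Model of PMULLW: keep the low 16 bits of each product."""
    return join_words([p & MASK16 for p in products(dest, src)])

def pmulhw(dest, src):
    """Model of PMULHW: keep the high 16 bits of each product."""
    return join_words([(p >> 16) & MASK16 for p in products(dest, src)])

def punpcklwd(dest, src):
    """Model of PUNPCKLWD: interleave the two low words of each operand."""
    d, s = split_words(dest), split_words(src)
    return join_words([d[0], s[0], d[1], s[1]])

# Hypothetical multiplicands, least significant word first.
a = join_words([1, 11327, 0, 0])
b = join_words([1, 10353, 0, 0])

low = pmullw(a, b)            # low halves:  0000_0000_5FCF_0001h
high = pmulhw(a, b)           # high halves: 0000_0000_06FD_0000h
full = punpcklwd(low, high)   # destination = low halves, source = high halves
print(f"{full:016X}")         # 06FD5FCF00000001
```

The low halves go in the destination operand and the high halves in the source operand, so each 32-bit result has the PMULHW word on top and the PMULLW word below it. The products of the upper two word pairs would be assembled the same way with PUNPCKHWD.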
Related instructions

See the PACKSSDW instruction.
See the PSRAD instruction.
See the PMULHW instruction.
See the PMULLW instruction.
See the PUNPCKHBW instruction.
See the PUNPCKHDQ instruction.
See the PUNPCKHWD instruction.
See the PUNPCKLBW instruction.
See the PUNPCKLDQ instruction.
PXOR

Mnemonic                     Opcode   Description
PXOR mmreg1, mmreg2/mem64    0F EFh   XOR 64-bit values

Privilege:            None
Registers affected:   MMX
Flags affected:       None

The PXOR instruction logically XORs the 64 bits of the source operand (an MMX register or a 64-bit memory location) with the 64 bits of the destination operand (an MMX register) and stores the result in the destination register. A logical XOR produces a 1 bit if exactly one of the two input bits is a 1. If both input bits are 0 or both input bits are 1, a logical XOR produces a 0 bit.

Exceptions generated:

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In protected mode, CPL = 3.)
Functional illustration of the PXOR instruction

In the functional illustration of the PXOR instruction, the 64-bit source value is logically XORed with the 64-bit destination value, and the result is stored in the destination register.

                 Bits 63-48           Bits 47-32           Bits 31-16           Bits 15-0
mmreg1           0101_1100_1100_0011  1100_1101_0100_1110  1011_0001_0011_1001  0110_0011_0101_1001
mmreg2/mem64     1111_0011_1100_1110  1100_0010_0100_0001  0111_0000_0000_1000  1110_1111_1000_1000
                 logical XOR          logical XOR          logical XOR          logical XOR
mmreg1 (result)  1010_1111_0000_1101  0000_1111_0000_1111  1100_0001_0011_0001  1000_1100_1101_0001

Related instructions

See the PAND instruction.
See the PANDN instruction.
See the POR instruction.
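The bitwise behavior of PXOR can be checked with a short software model. The following is a minimal Python sketch, not AMD code: it treats an MMX register as a 64-bit integer, the names pxor and word_bits are chosen here for illustration, and the operands are the two input values from the functional illustration, read with the leftmost word as bits 63-48 (an assumption about the figure layout).

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def pxor(dest, src):
    """Software model of PXOR: bitwise exclusive OR of two 64-bit values."""
    return (dest ^ src) & MASK64

def word_bits(word):
    """Format a 16-bit word as four underscore-separated binary nibbles."""
    bits = f"{word:016b}"
    return "_".join(bits[i:i + 4] for i in range(0, 16, 4))

mmreg1 = 0b0101_1100_1100_0011_1100_1101_0100_1110_1011_0001_0011_1001_0110_0011_0101_1001
mmreg2 = 0b1111_0011_1100_1110_1100_0010_0100_0001_0111_0000_0000_1000_1110_1111_1000_1000

result = pxor(mmreg1, mmreg2)

# Print the four 16-bit words of the result, most significant first.
for shift in (48, 32, 16, 0):
    print(word_bits((result >> shift) & 0xFFFF))

# XORing a value with itself clears every bit.
print(pxor(mmreg1, mmreg1))  # 0
```

Because every bit XORed with itself is 0, applying the operation with the same register as both operands yields an all-zero result, which is how the last line clears the modeled register.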